<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>[197426] trunk</title>
</head>
<body>

<style type="text/css"><!--
#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }
#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }
#msg dt:after { content:':';}
#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }
#msg dl a { font-weight: bold}
#msg dl a:link    { color:#fc3; }
#msg dl a:active  { color:#ff0; }
#msg dl a:visited { color:#cc6; }
h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }
#msg pre { overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }
#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }
#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }
#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }
#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }
#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }
#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }
#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }
#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }
#logmsg pre { background: #eee; padding: 1em; }
#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}
#logmsg dl { margin: 0; }
#logmsg dt { font-weight: bold; }
#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }
#logmsg dd:before { content:'\00bb';}
#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }
#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }
#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }
#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }
#logmsg table th.Corner { text-align: left; }
#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }
#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }
#patch { width: 100%; }
#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}
#patch .propset h4, #patch .binary h4 {margin:0;}
#patch pre {padding:0;line-height:1.2em;margin:0;}
#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}
#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}
#patch span {display:block;padding:0 10px;}
#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}
#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}
#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}
#patch .lines, .info {color:#888;background:#fff;}
--></style>
<div id="msg">
<dl class="meta">
<dt>Revision</dt> <dd><a href="http://trac.webkit.org/projects/webkit/changeset/197426">197426</a></dd>
<dt>Author</dt> <dd>msaboff@apple.com</dd>
<dt>Date</dt> <dd>2016-03-01 16:39:01 -0800 (Tue, 01 Mar 2016)</dd>
</dl>

<h3>Log Message</h3>
<pre>[ES6] Add support for Unicode regular expressions
https://bugs.webkit.org/show_bug.cgi?id=154842

Reviewed by Filip Pizlo.

Source/JavaScriptCore:

Added processing of Unicode regular expressions to the Yarr interpreter.

Changed parsing of regular expression patterns and PatternTerms to process characters as
UChar32 in the Yarr code.  The parser converts matched surrogate pairs into the appropriate
Unicode character when the expression is parsed.  When matching a unicode expression and
reading source characters, we convert a proper surrogate pair into a Unicode character and
advance the source cursor, &quot;pos&quot;, one more position.  The exception is when we know,
while generating a fixed character atom, that we need to match a unicode character
that doesn't fit in 16 bits.  The code calls this an extendedUnicodeCharacter and has a
helper to determine this.
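
For illustration only, the surrogate pair handling described above can be sketched in
JavaScript.  This is not the Yarr code; readChar, isLead, isTrail and isExtended are
hypothetical names, and the unicode argument stands in for the pattern's u flag.

```javascript
// Illustrative sketch only; these helper names are hypothetical, not Yarr's.
const isLead = (unit) => (unit >>> 10) === 0x36;   // 0xD800..0xDBFF
const isTrail = (unit) => (unit >>> 10) === 0x37;  // 0xDC00..0xDFFF

// Read one character at pos.  In unicode mode a proper surrogate pair is
// combined into a single code point and the cursor moves one extra position.
function readChar(str, pos, unicode) {
  const lead = str.charCodeAt(pos);
  const single = { ch: lead, next: pos + 1 };
  if (!unicode || !isLead(lead)) return single;
  const trail = str.charCodeAt(pos + 1);   // NaN when pos + 1 is past the end
  if (!isTrail(trail)) return single;      // unpaired lead: keep the code unit
  const ch = 0x10000 + (lead - 0xD800) * 0x400 + (trail - 0xDC00);
  return { ch, next: pos + 2 };
}

// A character is "extended" when it does not fit in 16 bits.
const isExtended = (ch) => ch > 0xFFFF;
```

Without the unicode argument the same input is read one 16-bit code unit at a time, which
is why the non-unicode tests see a length of 1 for a character class match on a pair.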

Added 'u' flag and 'unicode' identifier to regular expression classes.  Added an &quot;isUnicode&quot;
parameter to YarrPattern pattern() and internal users of that function.
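
From script, the new flag and property should be observable roughly as follows.  This is a
small sketch using only standard RegExp features; the variable names are illustrative.

```javascript
// Flags given in any order are reported in canonical order by .flags.
const re = new RegExp('a', 'uimg');
const observed = {
  flags: re.flags,        // 'gimu', as in regexp-flags-expected.txt
  unicode: re.unicode,    // true when the u flag is present
  plain: /a/.unicode,     // false otherwise
  onPrototype: Object.getOwnPropertyNames(RegExp.prototype).includes('unicode'),
};
console.log(observed);
```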

Updated the generation of the canonicalization tables to include a new set of tables that
follow ES 6.0, 21.2.2.8.2 Step 2.  Renamed the YarrCanonicalizeUCS2.* files to
YarrCanonicalizeUnicode.*.
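
As a rough illustration of why two sets of tables are needed, the legacy rule (Step 3) can
be written directly in JavaScript and compared with the case folding behavior of the u
flag (Step 2).  canonicalizeLegacy is a hypothetical name for this sketch and is not the
canonicalize() function in YarrCanonicalizeUnicode.js.

```javascript
// Sketch of the legacy (non-unicode) rule: upper-case the code unit, but keep
// the original when the result is not a single code unit or when a non-ASCII
// character would map to an ASCII one.
function canonicalizeLegacy(ch) {
  const upper = String.fromCharCode(ch).toUpperCase();
  if (upper.length !== 1) return ch;
  const cu = upper.charCodeAt(0);
  if (ch >= 128) {
    if (128 > cu) return ch;
  }
  return cu;
}

// U+017F (long s) upper-cases to 'S', so the legacy rule leaves it alone,
// while case folding under the u flag lets it match 's'.
const longS = {
  legacyCanonical: canonicalizeLegacy(0x17F),
  legacyMatch: /\u017F/i.test('s'),
  unicodeMatch: /\u017F/iu.test('s'),
};
console.log(longS);
```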

Added a new LayoutTests/js test that tests the added functionality.  Updated other tests that
have minor ES6 unicode checks and look for valid flags.

Ran the ChakraCore Unicode regular expression tests as well.

* CMakeLists.txt:
* JavaScriptCore.vcxproj/JavaScriptCore.vcxproj:
* JavaScriptCore.vcxproj/JavaScriptCore.vcxproj.filters:
* JavaScriptCore.xcodeproj/project.pbxproj:

* inspector/ContentSearchUtilities.cpp:
(Inspector::ContentSearchUtilities::findMagicComment):
* yarr/RegularExpression.cpp:
(JSC::Yarr::RegularExpression::Private::compile):
Updated use of pattern().

* runtime/CommonIdentifiers.h:
* runtime/RegExp.cpp:
(JSC::regExpFlags):
(JSC::RegExpFunctionalTestCollector::outputOneTest):
(JSC::RegExp::finishCreation):
(JSC::RegExp::compile):
(JSC::RegExp::compileMatchOnly):
* runtime/RegExp.h:
* runtime/RegExpKey.h:
* runtime/RegExpPrototype.cpp:
(JSC::regExpProtoFuncCompile):
(JSC::flagsString):
(JSC::regExpProtoGetterMultiline):
(JSC::regExpProtoGetterUnicode):
(JSC::regExpProtoGetterFlags):
Updated for new 'u' (unicode) flag.  Added a check to use the interpreter for unicode regular expressions.

* tests/es6.yaml:
* tests/stress/static-getter-in-names.js:
Updated tests for new flag and for passing the minimal es6 regular expression processing.

* yarr/Yarr.h: Updated the size of information now kept for backtracking.

* yarr/YarrCanonicalizeUCS2.cpp: Removed.
* yarr/YarrCanonicalizeUCS2.h: Removed.
* yarr/YarrCanonicalizeUCS2.js: Removed.
* yarr/YarrCanonicalizeUnicode.cpp: Copied from Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.cpp.
* yarr/YarrCanonicalizeUnicode.h: Copied from Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.h.
(JSC::Yarr::canonicalCharacterSetInfo):
(JSC::Yarr::canonicalRangeInfoFor):
(JSC::Yarr::getCanonicalPair):
(JSC::Yarr::isCanonicallyUnique):
(JSC::Yarr::areCanonicallyEquivalent):
(JSC::Yarr::rangeInfoFor): Deleted.
* yarr/YarrCanonicalizeUnicode.js: Copied from Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.js.
(printHeader):
(printFooter):
(hex):
(canonicalize):
(canonicalizeUnicode):
(createUCS2CanonicalGroups):
(createUnicodeCanonicalGroups):
(cu.in.groupedCanonically.characters.sort): Deleted.
(cu.in.groupedCanonically.else): Deleted.
Refactored to output two sets of tables, one for UCS2 and one for Unicode.  The UCS2 tables follow
the legacy canonicalization rules now specified in ES 6.0, 21.2.2.8.2 Step 3.  The new Unicode
tables follow the rules specified in ES 6.0, 21.2.2.8.2 Step 2.  Eliminated the unused Latin1 tables.

* yarr/YarrInterpreter.cpp:
(JSC::Yarr::Interpreter::InputStream::InputStream):
(JSC::Yarr::Interpreter::InputStream::readChecked):
(JSC::Yarr::Interpreter::InputStream::readSurrogatePairChecked):
(JSC::Yarr::Interpreter::InputStream::reread):
(JSC::Yarr::Interpreter::InputStream::prev):
(JSC::Yarr::Interpreter::testCharacterClass):
(JSC::Yarr::Interpreter::checkCharacter):
(JSC::Yarr::Interpreter::checkSurrogatePair):
(JSC::Yarr::Interpreter::checkCasedCharacter):
(JSC::Yarr::Interpreter::tryConsumeBackReference):
(JSC::Yarr::Interpreter::backtrackPatternCharacter):
(JSC::Yarr::Interpreter::matchCharacterClass):
(JSC::Yarr::Interpreter::backtrackCharacterClass):
(JSC::Yarr::Interpreter::matchParenthesesTerminalEnd):
(JSC::Yarr::Interpreter::matchDisjunction):
(JSC::Yarr::Interpreter::Interpreter):
(JSC::Yarr::ByteCompiler::assertionWordBoundary):
(JSC::Yarr::ByteCompiler::atomPatternCharacter):
* yarr/YarrInterpreter.h:
(JSC::Yarr::ByteTerm::ByteTerm):
(JSC::Yarr::BytecodePattern::BytecodePattern):
* yarr/YarrJIT.cpp:
(JSC::Yarr::YarrGenerator::optimizeAlternative):
(JSC::Yarr::YarrGenerator::matchCharacterClassRange):
(JSC::Yarr::YarrGenerator::matchCharacterClass):
(JSC::Yarr::YarrGenerator::notAtEndOfInput):
(JSC::Yarr::YarrGenerator::jumpIfCharNotEquals):
(JSC::Yarr::YarrGenerator::generatePatternCharacterOnce):
(JSC::Yarr::YarrGenerator::generatePatternCharacterFixed):
(JSC::Yarr::YarrGenerator::generatePatternCharacterGreedy):
(JSC::Yarr::YarrGenerator::backtrackPatternCharacterNonGreedy):
* yarr/YarrParser.h:
(JSC::Yarr::Parser::CharacterClassParserDelegate::atomPatternCharacter):
(JSC::Yarr::Parser::Parser):
(JSC::Yarr::Parser::parseEscape):
(JSC::Yarr::Parser::consumePossibleSurrogatePair):
(JSC::Yarr::Parser::parseCharacterClass):
(JSC::Yarr::Parser::parseTokens):
(JSC::Yarr::Parser::parse):
(JSC::Yarr::Parser::atEndOfPattern):
(JSC::Yarr::Parser::patternRemaining):
(JSC::Yarr::Parser::peek):
(JSC::Yarr::parse):
* yarr/YarrPattern.cpp:
(JSC::Yarr::CharacterClassConstructor::CharacterClassConstructor):
(JSC::Yarr::CharacterClassConstructor::append):
(JSC::Yarr::CharacterClassConstructor::putChar):
(JSC::Yarr::CharacterClassConstructor::putUnicodeIgnoreCase):
(JSC::Yarr::CharacterClassConstructor::putRange):
(JSC::Yarr::CharacterClassConstructor::charClass):
(JSC::Yarr::CharacterClassConstructor::addSorted):
(JSC::Yarr::CharacterClassConstructor::addSortedRange):
(JSC::Yarr::YarrPatternConstructor::YarrPatternConstructor):
(JSC::Yarr::YarrPatternConstructor::assertionWordBoundary):
(JSC::Yarr::YarrPatternConstructor::atomPatternCharacter):
(JSC::Yarr::YarrPatternConstructor::atomCharacterClassBegin):
(JSC::Yarr::YarrPatternConstructor::atomCharacterClassAtom):
(JSC::Yarr::YarrPatternConstructor::atomCharacterClassRange):
(JSC::Yarr::YarrPatternConstructor::setupAlternativeOffsets):
(JSC::Yarr::YarrPattern::compile):
(JSC::Yarr::YarrPattern::YarrPattern):
* yarr/YarrPattern.h:
(JSC::Yarr::CharacterRange::CharacterRange):
(JSC::Yarr::CharacterClass::CharacterClass):
(JSC::Yarr::PatternTerm::PatternTerm):
(JSC::Yarr::YarrPattern::reset):
* yarr/YarrSyntaxChecker.cpp:
(JSC::Yarr::SyntaxChecker::assertionBOL):
(JSC::Yarr::SyntaxChecker::assertionEOL):
(JSC::Yarr::SyntaxChecker::assertionWordBoundary):
(JSC::Yarr::SyntaxChecker::atomPatternCharacter):
(JSC::Yarr::SyntaxChecker::atomBuiltInCharacterClass):
(JSC::Yarr::SyntaxChecker::atomCharacterClassBegin):
(JSC::Yarr::SyntaxChecker::atomCharacterClassAtom):
(JSC::Yarr::checkSyntax):

LayoutTests:

Added a new test for the added unicode regular expression processing.

Updated several tests for the u flag changes and &quot;unicode&quot; property.

* js/regexp-unicode-expected.txt: Added.
* js/regexp-unicode.html: Added.
* js/script-tests/regexp-unicode.js: Added.
New test.

* js/Object-getOwnPropertyNames-expected.txt:
* js/regexp-flags-expected.txt:
* js/script-tests/Object-getOwnPropertyNames.js:
* js/script-tests/regexp-flags.js:
(RegExp.prototype.hasOwnProperty):
Updated tests.</pre>

<h3>Modified Paths</h3>
<ul>
<li><a href="#trunkLayoutTestsChangeLog">trunk/LayoutTests/ChangeLog</a></li>
<li><a href="#trunkLayoutTestsjsObjectgetOwnPropertyNamesexpectedtxt">trunk/LayoutTests/js/Object-getOwnPropertyNames-expected.txt</a></li>
<li><a href="#trunkLayoutTestsjsregexpflagsexpectedtxt">trunk/LayoutTests/js/regexp-flags-expected.txt</a></li>
<li><a href="#trunkLayoutTestsjsscripttestsObjectgetOwnPropertyNamesjs">trunk/LayoutTests/js/script-tests/Object-getOwnPropertyNames.js</a></li>
<li><a href="#trunkLayoutTestsjsscripttestsregexpflagsjs">trunk/LayoutTests/js/script-tests/regexp-flags.js</a></li>
<li><a href="#trunkSourceJavaScriptCoreCMakeListstxt">trunk/Source/JavaScriptCore/CMakeLists.txt</a></li>
<li><a href="#trunkSourceJavaScriptCoreChangeLog">trunk/Source/JavaScriptCore/ChangeLog</a></li>
<li><a href="#trunkSourceJavaScriptCoreJavaScriptCorevcxprojJavaScriptCorevcxproj">trunk/Source/JavaScriptCore/JavaScriptCore.vcxproj/JavaScriptCore.vcxproj</a></li>
<li><a href="#trunkSourceJavaScriptCoreJavaScriptCorevcxprojJavaScriptCorevcxprojfilters">trunk/Source/JavaScriptCore/JavaScriptCore.vcxproj/JavaScriptCore.vcxproj.filters</a></li>
<li><a href="#trunkSourceJavaScriptCoreJavaScriptCorexcodeprojprojectpbxproj">trunk/Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj</a></li>
<li><a href="#trunkSourceJavaScriptCoreinspectorContentSearchUtilitiescpp">trunk/Source/JavaScriptCore/inspector/ContentSearchUtilities.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoreruntimeCommonIdentifiersh">trunk/Source/JavaScriptCore/runtime/CommonIdentifiers.h</a></li>
<li><a href="#trunkSourceJavaScriptCoreruntimeRegExpcpp">trunk/Source/JavaScriptCore/runtime/RegExp.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoreruntimeRegExph">trunk/Source/JavaScriptCore/runtime/RegExp.h</a></li>
<li><a href="#trunkSourceJavaScriptCoreruntimeRegExpKeyh">trunk/Source/JavaScriptCore/runtime/RegExpKey.h</a></li>
<li><a href="#trunkSourceJavaScriptCoreruntimeRegExpPrototypecpp">trunk/Source/JavaScriptCore/runtime/RegExpPrototype.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoretestses6yaml">trunk/Source/JavaScriptCore/tests/es6.yaml</a></li>
<li><a href="#trunkSourceJavaScriptCoretestsstressstaticgetterinnamesjs">trunk/Source/JavaScriptCore/tests/stress/static-getter-in-names.js</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrRegularExpressioncpp">trunk/Source/JavaScriptCore/yarr/RegularExpression.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrh">trunk/Source/JavaScriptCore/yarr/Yarr.h</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrInterpretercpp">trunk/Source/JavaScriptCore/yarr/YarrInterpreter.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrInterpreterh">trunk/Source/JavaScriptCore/yarr/YarrInterpreter.h</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrJITcpp">trunk/Source/JavaScriptCore/yarr/YarrJIT.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrParserh">trunk/Source/JavaScriptCore/yarr/YarrParser.h</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrPatterncpp">trunk/Source/JavaScriptCore/yarr/YarrPattern.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrPatternh">trunk/Source/JavaScriptCore/yarr/YarrPattern.h</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrSyntaxCheckercpp">trunk/Source/JavaScriptCore/yarr/YarrSyntaxChecker.cpp</a></li>
</ul>

<h3>Added Paths</h3>
<ul>
<li><a href="#trunkLayoutTestsjsregexpunicodeexpectedtxt">trunk/LayoutTests/js/regexp-unicode-expected.txt</a></li>
<li><a href="#trunkLayoutTestsjsregexpunicodehtml">trunk/LayoutTests/js/regexp-unicode.html</a></li>
<li><a href="#trunkLayoutTestsjsscripttestsregexpunicodejs">trunk/LayoutTests/js/script-tests/regexp-unicode.js</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrCanonicalizeUnicodecpp">trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrCanonicalizeUnicodeh">trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.h</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrCanonicalizeUnicodejs">trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.js</a></li>
</ul>

<h3>Removed Paths</h3>
<ul>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrCanonicalizeUCS2cpp">trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrCanonicalizeUCS2h">trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.h</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrCanonicalizeUCS2js">trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.js</a></li>
</ul>

</div>
<div id="patch">
<h3>Diff</h3>
<a id="trunkLayoutTestsChangeLog"></a>
<div class="modfile"><h4>Modified: trunk/LayoutTests/ChangeLog (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/LayoutTests/ChangeLog        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/LayoutTests/ChangeLog        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -1,3 +1,26 @@
</span><ins>+2016-03-01  Michael Saboff  &lt;msaboff@apple.com&gt;
+
+        [ES6] Add support for Unicode regular expressions
+        https://bugs.webkit.org/show_bug.cgi?id=154842
+
+        Reviewed by Filip Pizlo.
+
+        Added a new test for the added unicode regular expression processing.
+
+        Updated several tests for the y flag changes and &quot;unicode&quot; property.
+
+        * js/regexp-unicode-expected.txt: Added.
+        * js/regexp-unicode.html: Added.
+        * js/script-tests/regexp-unicode.js: Added.
+        New test.
+
+        * js/Object-getOwnPropertyNames-expected.txt:
+        * js/regexp-flags-expected.txt:
+        * js/script-tests/Object-getOwnPropertyNames.js:
+        * js/script-tests/regexp-flags.js:
+        (RegExp.prototype.hasOwnProperty):
+        Updated tests.
+
</ins><span class="cx"> 2016-03-01  Ryan Haddad  &lt;ryanhaddad@apple.com&gt;
</span><span class="cx"> 
</span><span class="cx">         Marking fast/text/crash-complex-text-surrogate.html as flaky on mac
</span></span></pre></div>
<a id="trunkLayoutTestsjsObjectgetOwnPropertyNamesexpectedtxt"></a>
<div class="modfile"><h4>Modified: trunk/LayoutTests/js/Object-getOwnPropertyNames-expected.txt (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/LayoutTests/js/Object-getOwnPropertyNames-expected.txt        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/LayoutTests/js/Object-getOwnPropertyNames-expected.txt        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -56,7 +56,7 @@
</span><span class="cx"> PASS getSortedOwnPropertyNames(Date) is ['UTC', 'length', 'name', 'now', 'parse', 'prototype']
</span><span class="cx"> PASS getSortedOwnPropertyNames(Date.prototype) is ['constructor', 'getDate', 'getDay', 'getFullYear', 'getHours', 'getMilliseconds', 'getMinutes', 'getMonth', 'getSeconds', 'getTime', 'getTimezoneOffset', 'getUTCDate', 'getUTCDay', 'getUTCFullYear', 'getUTCHours', 'getUTCMilliseconds', 'getUTCMinutes', 'getUTCMonth', 'getUTCSeconds', 'getYear', 'setDate', 'setFullYear', 'setHours', 'setMilliseconds', 'setMinutes', 'setMonth', 'setSeconds', 'setTime', 'setUTCDate', 'setUTCFullYear', 'setUTCHours', 'setUTCMilliseconds', 'setUTCMinutes', 'setUTCMonth', 'setUTCSeconds', 'setYear', 'toDateString', 'toGMTString', 'toISOString', 'toJSON', 'toLocaleDateString', 'toLocaleString', 'toLocaleTimeString', 'toString', 'toTimeString', 'toUTCString', 'valueOf']
</span><span class="cx"> PASS getSortedOwnPropertyNames(RegExp) is ['$&amp;', &quot;$'&quot;, '$*', '$+', '$1', '$2', '$3', '$4', '$5', '$6', '$7', '$8', '$9', '$_', '$`', 'input', 'lastMatch', 'lastParen', 'leftContext', 'length', 'multiline', 'name', 'prototype', 'rightContext']
</span><del>-PASS getSortedOwnPropertyNames(RegExp.prototype) is ['compile', 'constructor', 'exec', 'flags', 'global', 'ignoreCase', 'lastIndex', 'multiline', 'source', 'test', 'toString']
</del><ins>+PASS getSortedOwnPropertyNames(RegExp.prototype) is ['compile', 'constructor', 'exec', 'flags', 'global', 'ignoreCase', 'lastIndex', 'multiline', 'source', 'test', 'toString', 'unicode']
</ins><span class="cx"> PASS getSortedOwnPropertyNames(Error) is ['length', 'name', 'prototype']
</span><span class="cx"> PASS getSortedOwnPropertyNames(Error.prototype) is ['constructor', 'message', 'name', 'toString']
</span><span class="cx"> PASS getSortedOwnPropertyNames(Math) is ['E','LN10','LN2','LOG10E','LOG2E','PI','SQRT1_2','SQRT2','abs','acos','acosh','asin','asinh','atan','atan2','atanh','cbrt','ceil','clz32','cos','cosh','exp','expm1','floor','fround','hypot','imul','log','log10','log1p','log2','max','min','pow','random','round','sign','sin','sinh','sqrt','tan','tanh','trunc']
</span></span></pre></div>
<a id="trunkLayoutTestsjsregexpflagsexpectedtxt"></a>
<div class="modfile"><h4>Modified: trunk/LayoutTests/js/regexp-flags-expected.txt (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/LayoutTests/js/regexp-flags-expected.txt        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/LayoutTests/js/regexp-flags-expected.txt        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -23,6 +23,10 @@
</span><span class="cx"> PASS flags.call({global: true, multiline: true, ignoreCase: true}) is 'gim'
</span><span class="cx"> PASS flags.call({global: 1, multiline: 0, ignoreCase: 2}) is 'gi'
</span><span class="cx"> PASS flags.call({ __proto__: { multiline: true } }) is 'm'
</span><ins>+unicode flag
+PASS /a/uimg.flags is 'gimu'
+PASS new RegExp('a', 'uimg').flags is 'gimu'
+PASS flags.call({global: true, multiline: true, ignoreCase: true, unicode: true}) is 'gimu'
</ins><span class="cx"> PASS successfullyParsed is true
</span><span class="cx"> 
</span><span class="cx"> TEST COMPLETE
</span></span></pre></div>
<a id="trunkLayoutTestsjsregexpunicodeexpectedtxt"></a>
<div class="addfile"><h4>Added: trunk/LayoutTests/js/regexp-unicode-expected.txt (0 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/LayoutTests/js/regexp-unicode-expected.txt                                (rev 0)
+++ trunk/LayoutTests/js/regexp-unicode-expected.txt        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -0,0 +1,97 @@
</span><ins>+Test for unicode regular expression processing
+
+On success, you will see a series of &quot;PASS&quot; messages, followed by &quot;TEST COMPLETE&quot;.
+
+
+PASS &quot;a&quot;.match(/a/)[0].length is 1
+PASS &quot;a&quot;.match(/A/i)[0].length is 1
+PASS &quot;a&quot;.match(/a/u)[0].length is 1
+PASS &quot;a&quot;.match(/A/iu)[0].length is 1
+PASS &quot;Ȓ&quot;.match(/Ȓ/)[0].length is 1
+PASS &quot;Ȓ&quot;.match(/Ȓ/u)[0].length is 1
+PASS &quot;ሴ&quot;.match(/ሴ/)[0].length is 1
+PASS &quot;ሴ&quot;.match(/ሴ/u)[0].length is 1
+PASS &quot;⪼&quot;.match(/⪼/)[0].length is 1
+PASS &quot;㿭&quot;.match(/㿭/u)[0].length is 1
+PASS &quot;𒍅&quot;.match(/𒍅/u)[0].length is 2
+PASS &quot;𒍅&quot;.match(/𒍅/u)[0].length is 2
+PASS &quot;𝌆&quot;.match(/𝌆/)[0].length is 2
+PASS /𐑏/u.test(&quot;𐑏&quot;) is true
+PASS /𐑏/u.test(&quot;𐑏&quot;) is true
+PASS &quot;𝌆&quot;.match(/𝌆/u)[0].length is 2
+PASS /(𐀀|𐐀|𐐩)/u.test(&quot;𐐀&quot;) is true
+PASS &quot;𐄣&quot;.match(/a|𐄣|b/u)[0].length is 2
+PASS &quot;b&quot;.match(/a|𐄣|b/u)[0].length is 1
+PASS /(?:a|𐄣|b)x/u.test(&quot;𐄣&quot;) is false
+PASS /(?:a|𐄣|b)x/u.test(&quot;𐄣x&quot;) is true
+PASS /(?:a|𐄣|b)x/u.test(&quot;b&quot;) is false
+PASS /(?:a|𐄣|b)x/u.test(&quot;bx&quot;) is true
+PASS &quot;a𐄣x&quot;.match(/a𐄣b|a𐄣x/u)[0].length is 4
+PASS /(𐀀|𐐀|𐐩)x/ui.test(&quot;𐐀x&quot;) is true
+PASS /(𐀀|𐐀|𐐩)x/ui.test(&quot;𐐩x&quot;) is true
+PASS /(𐀀|𐐀|𐐩)x/ui.test(&quot;𐐁x&quot;) is true
+PASS /(𐀀|𐐀|𐐩)x/ui.test(&quot;𐐨x&quot;) is true
+PASS &quot;𐐩&quot;.match(/a|𐐁|b/iu)[0].length is 2
+PASS &quot;B&quot;.match(/a|𐄣|b/iu)[0].length is 1
+PASS /(?:A|𐄣|b)x/iu.test(&quot;𐄣&quot;) is false
+PASS /(?:A|𐄣|b)x/iu.test(&quot;𐄣x&quot;) is true
+PASS /(?:A|𐄣|b)x/iu.test(&quot;b&quot;) is false
+PASS /(?:A|𐄣|b)x/iu.test(&quot;bx&quot;) is true
+PASS &quot;a𐄣X&quot;.match(/a𐄣b|a𐄣x/iu)[0].length is 4
+PASS &quot;Ťx&quot;.match(/ťx/iu)[0].length is 2
+PASS &quot;𝌆&quot;.match(/^.$/u)[0].length is 2
+PASS &quot;It is 78°&quot;.match(/.*/u)[0].length is 9
+PASS &quot;𝌆&quot;.match(/[𝌆a]/)[0].length is 1
+PASS &quot;𝌆&quot;.match(/[a𝌆]/u)[0].length is 2
+PASS &quot;𝌆&quot;.match(/[𝌆a]/u)[0].length is 2
+PASS &quot;𝌆&quot;.match(/[a-𝌆]/)[0].length is 1
+PASS &quot;𝌆&quot;.match(/[a-𝌆]/u)[0].length is 2
+PASS &quot;X&quot;.match(/[ -𐑏]/u)[0].length is 1
+PASS &quot;က&quot;.match(/[ -𐑏]/u)[0].length is 1
+PASS &quot;𐐧&quot;.match(/[ -𐑏]/u)[0].length is 2
+PASS re1.test(&quot;Z&quot;) is false
+PASS re1.test(&quot;က&quot;) is false
+PASS re1.test(&quot;𐐀&quot;) is false
+PASS re2.test(&quot;A&quot;) is true
+PASS re2.test(&quot;\x{FFFF}&quot;) is false
+PASS re2.test(&quot;𒍅&quot;) is true
+PASS &quot;𐌑𐌑𐌑&quot;.match(/𐌑*a|𐌑*./u)[0] is &quot;𐌑𐌑𐌑&quot;
+PASS &quot;a𐌑𐌑&quot;.match(/a𐌑*?$/u)[0] is &quot;a𐌑𐌑&quot;
+PASS &quot;a𐌑𐌑𐌑c&quot;.match(/a𐌑*cd|a𐌑*c/u)[0] is &quot;a𐌑𐌑𐌑c&quot;
+PASS &quot;a𐌑𐌑𐌑c&quot;.match(/a𐌑+cd|a𐌑+c/u)[0] is &quot;a𐌑𐌑𐌑c&quot;
+PASS &quot;𐌑𐌑𐌑&quot;.match(/𐌑+?a|𐌑+?./u)[0] is &quot;𐌑𐌑&quot;
+PASS &quot;𐌑𐌑𐌑&quot;.match(/𐌑+?a|𐌑+?$/u)[0] is &quot;𐌑𐌑𐌑&quot;
+PASS &quot;a𐌑𐌑𐌑c&quot;.match(/a𐌑*?cd|a𐌑*?c/u)[0] is &quot;a𐌑𐌑𐌑c&quot;
+PASS &quot;a𐌑𐌑𐌑c&quot;.match(/a𐌑+?cd|a𐌑+?c/u)[0] is &quot;a𐌑𐌑𐌑c&quot;
+PASS &quot;𐌑𐌑𐌑&quot;.match(/𐌑+?a|𐌑+?./iu)[0] is &quot;𐌑𐌑&quot;
+PASS &quot;𐐪𐐪𐌑&quot;.match(/𐐂*𐈀|𐐂*𐌑/iu)[0] is &quot;𐐪𐐪𐌑&quot;
+PASS &quot;𐐪𐐪𐌑&quot;.match(/𐐂+𐈀|𐐂+𐌑/iu)[0] is &quot;𐐪𐐪𐌑&quot;
+PASS &quot;𐐪𐐪𐌑&quot;.match(/𐐂*?𐈀|𐐂*?𐌑/iu)[0] is &quot;𐐪𐐪𐌑&quot;
+PASS &quot;𐐪𐐪𐌑&quot;.match(/𐐂+?𐈀|𐐂+?𐌑/iu)[0] is &quot;𐐪𐐪𐌑&quot;
+PASS &quot;ab𐌑c𐨁&quot;.match(/abc|ab𐌑cd|ab𐌑c𐨁d|ab𐌑c𐨁/u)[0] is &quot;ab𐌑c𐨁&quot;
+PASS &quot;ab𐐨c𐨁&quot;.match(/abc|ab𐐀cd|ab𐐀c𐨁d|ab𐐀c𐨁/iu)[0] is &quot;ab𐐨c𐨁&quot;
+PASS /abc|ab𐐀cd|ab𐐀c𐨁d|ab𐐀c𐨁/iu.test(&quot;qwerty123&quot;) is false
+PASS &quot;a𐐨𐐨𐐨c&quot;.match(/ac|a𐐀*cd|a𐐀+cd|a𐐀+c/iu)[0] is &quot;a𐐨𐐨𐐨c&quot;
+PASS &quot;ab𐐨𐐨𐐨c𐨁&quot;.match(/abc|ab𐐀*cd|ab𐐀+c𐨁d|ab𐐀+c𐨁/iu)[0] is &quot;ab𐐨𐐨𐐨c𐨁&quot;
+PASS &quot;ab𐐨𐐨𐐨&quot;.match(/abc|ab𐐨*./u)[0] is &quot;ab𐐨𐐨𐐨&quot;
+PASS &quot;ab𐐨𐐨𐐨&quot;.match(/abc|ab𐐀*./iu)[0] is &quot;ab𐐨𐐨𐐨&quot;
+PASS match3[0] is &quot;a𐐐𐐐b&quot;
+PASS match3[1] is undefined.
+PASS match3[2] is &quot;a𐐐𐐐b&quot;
+PASS match4[0] is &quot;a𐐸𐐸b&quot;
+PASS match4[1] is undefined.
+PASS match4[2] is &quot;𐐸𐐸&quot;
+PASS match5[0] is &quot;a𐐒𐐒b𐐒𐐒&quot;
+PASS match5[1] is undefined.
+PASS match5[2] is &quot;𐐒𐐒&quot;
+PASS match6[0] is &quot;a𐐒𐐒b𐐺𐐒&quot;
+PASS match6[1] is undefined.
+PASS match6[2] is &quot;𐐒𐐒&quot;
+PASS /ẚbc/ui.test(&quot;abc&quot;) is true
+PASS /abc/ui.test(&quot;ẚbc&quot;) is true
+PASS /texẗ/ui.test(&quot;text&quot;) is true
+PASS /text/ui.test(&quot;ẗext&quot;) is true
+PASS successfullyParsed is true
+
+TEST COMPLETE
+
</ins></span></pre></div>
<a id="trunkLayoutTestsjsregexpunicodehtml"></a>
<div class="addfile"><h4>Added: trunk/LayoutTests/js/regexp-unicode.html (0 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/LayoutTests/js/regexp-unicode.html                                (rev 0)
+++ trunk/LayoutTests/js/regexp-unicode.html        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -0,0 +1,10 @@
</span><ins>+&lt;!DOCTYPE HTML PUBLIC &quot;-//IETF//DTD HTML//EN&quot;&gt;
+&lt;html&gt;
+&lt;head&gt;
+&lt;script src=&quot;../resources/js-test-pre.js&quot;&gt;&lt;/script&gt;
+&lt;/head&gt;
+&lt;body&gt;
+&lt;script src=&quot;script-tests/regexp-unicode.js&quot;&gt;&lt;/script&gt;
+&lt;script src=&quot;../resources/js-test-post.js&quot;&gt;&lt;/script&gt;
+&lt;/body&gt;
+&lt;/html&gt;
</ins></span></pre></div>
<a id="trunkLayoutTestsjsscripttestsObjectgetOwnPropertyNamesjs"></a>
<div class="modfile"><h4>Modified: trunk/LayoutTests/js/script-tests/Object-getOwnPropertyNames.js (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/LayoutTests/js/script-tests/Object-getOwnPropertyNames.js        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/LayoutTests/js/script-tests/Object-getOwnPropertyNames.js        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -65,7 +65,7 @@
</span><span class="cx">     &quot;Date&quot;: &quot;['UTC', 'length', 'name', 'now', 'parse', 'prototype']&quot;,
</span><span class="cx">     &quot;Date.prototype&quot;: &quot;['constructor', 'getDate', 'getDay', 'getFullYear', 'getHours', 'getMilliseconds', 'getMinutes', 'getMonth', 'getSeconds', 'getTime', 'getTimezoneOffset', 'getUTCDate', 'getUTCDay', 'getUTCFullYear', 'getUTCHours', 'getUTCMilliseconds', 'getUTCMinutes', 'getUTCMonth', 'getUTCSeconds', 'getYear', 'setDate', 'setFullYear', 'setHours', 'setMilliseconds', 'setMinutes', 'setMonth', 'setSeconds', 'setTime', 'setUTCDate', 'setUTCFullYear', 'setUTCHours', 'setUTCMilliseconds', 'setUTCMinutes', 'setUTCMonth', 'setUTCSeconds', 'setYear', 'toDateString', 'toGMTString', 'toISOString', 'toJSON', 'toLocaleDateString', 'toLocaleString', 'toLocaleTimeString', 'toString', 'toTimeString', 'toUTCString', 'valueOf']&quot;,
</span><span class="cx">     &quot;RegExp&quot;: &quot;['$&amp;', \&quot;$'\&quot;, '$*', '$+', '$1', '$2', '$3', '$4', '$5', '$6', '$7', '$8', '$9', '$_', '$`', 'input', 'lastMatch', 'lastParen', 'leftContext', 'length', 'multiline', 'name', 'prototype', 'rightContext']&quot;,
</span><del>-    &quot;RegExp.prototype&quot;: &quot;['compile', 'constructor', 'exec', 'flags', 'global', 'ignoreCase', 'lastIndex', 'multiline', 'source', 'test', 'toString']&quot;,
</del><ins>+    &quot;RegExp.prototype&quot;: &quot;['compile', 'constructor', 'exec', 'flags', 'global', 'ignoreCase', 'lastIndex', 'multiline', 'source', 'test', 'toString', 'unicode']&quot;,
</ins><span class="cx">     &quot;Error&quot;: &quot;['length', 'name', 'prototype']&quot;,
</span><span class="cx">     &quot;Error.prototype&quot;: &quot;['constructor', 'message', 'name', 'toString']&quot;,
</span><span class="cx">     &quot;Math&quot;: &quot;['E','LN10','LN2','LOG10E','LOG2E','PI','SQRT1_2','SQRT2','abs','acos','acosh','asin','asinh','atan','atan2','atanh','cbrt','ceil','clz32','cos','cosh','exp','expm1','floor','fround','hypot','imul','log','log10','log1p','log2','max','min','pow','random','round','sign','sin','sinh','sqrt','tan','tanh','trunc']&quot;,
</span></span></pre></div>
<a id="trunkLayoutTestsjsscripttestsregexpflagsjs"></a>
<div class="modfile"><h4>Modified: trunk/LayoutTests/js/script-tests/regexp-flags.js (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/LayoutTests/js/script-tests/regexp-flags.js        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/LayoutTests/js/script-tests/regexp-flags.js        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -28,6 +28,11 @@
</span><span class="cx"> // inherited properties count
</span><span class="cx"> shouldBe(&quot;flags.call({ __proto__: { multiline: true } })&quot;, &quot;'m'&quot;);
</span><span class="cx"> 
</span><ins>+debug(&quot;unicode flag&quot;);
+shouldBe(&quot;/a/uimg.flags&quot;, &quot;'gimu'&quot;);
+shouldBe(&quot;new RegExp('a', 'uimg').flags&quot;, &quot;'gimu'&quot;);
+shouldBe(&quot;flags.call({global: true, multiline: true, ignoreCase: true, unicode: true})&quot;, &quot;'gimu'&quot;);
+
</ins><span class="cx"> if (RegExp.prototype.hasOwnProperty('sticky')) {
</span><span class="cx">   debug(&quot;sticky flag&quot;);
</span><span class="cx">   // when the engine supports &quot;sticky&quot;, these tests will fail by design.
</span><span class="lines">@@ -36,11 +41,3 @@
</span><span class="cx">   shouldBe(&quot;new RegExp('a', 'yimg').flags&quot;, &quot;'gimy'&quot;);
</span><span class="cx">   shouldBe(&quot;flags.call({global: true, multiline: true, ignoreCase: true, sticky: true})&quot;, &quot;'gimy'&quot;);
</span><span class="cx"> }
</span><del>-if (RegExp.prototype.hasOwnProperty('unicode')) {
-  debug(&quot;unicode flag&quot;);
-  // when the engine supports &quot;unicode&quot;, these tests will fail by design.
-  // Hopefully, only the expected output will need updating.
-  shouldBe(&quot;/a/uimg.flags&quot;, &quot;'gimu'&quot;);
-  shouldBe(&quot;new RegExp('a', 'uimg').flags&quot;, &quot;'gimu'&quot;);
-  shouldBe(&quot;flags.call({global: true, multiline: true, ignoreCase: true, unicode: true})&quot;, &quot;'gimu'&quot;);
-}
</del></span></pre></div>
<a id="trunkLayoutTestsjsscripttestsregexpunicodejs"></a>
<div class="addfile"><h4>Added: trunk/LayoutTests/js/script-tests/regexp-unicode.js (0 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/LayoutTests/js/script-tests/regexp-unicode.js                                (rev 0)
+++ trunk/LayoutTests/js/script-tests/regexp-unicode.js        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -0,0 +1,142 @@
</span><ins>+description(
+'Test for unicode regular expression processing'
+);
+
+// Test \u{} escapes in a regular expression
+shouldBe('&quot;a&quot;.match(/\u{61}/)[0].length', '1');
+shouldBe('&quot;a&quot;.match(/\u{41}/i)[0].length', '1');
+shouldBe('&quot;a&quot;.match(/\u{061}/u)[0].length', '1');
+shouldBe('&quot;a&quot;.match(/\u{041}/iu)[0].length', '1');
+shouldBe('&quot;\u{212}&quot;.match(/\u{212}/)[0].length', '1');
+shouldBe('&quot;\u{212}&quot;.match(/\u{0212}/u)[0].length', '1');
+shouldBe('&quot;\u{1234}&quot;.match(/\u{1234}/)[0].length', '1');
+shouldBe('&quot;\u{1234}&quot;.match(/\u{01234}/u)[0].length', '1');
+shouldBe('&quot;\u{2abc}&quot;.match(/\u{2abc}/)[0].length', '1');
+shouldBe('&quot;\u{03fed}&quot;.match(/\u{03fed}/u)[0].length', '1');
+shouldBe('&quot;\u{12345}&quot;.match(/\u{12345}/u)[0].length', '2');
+shouldBe('&quot;\u{12345}&quot;.match(/\u{012345}/u)[0].length', '2');
+shouldBe('&quot;\u{1d306}&quot;.match(/\u{1d306}/)[0].length', '2');
+shouldBeTrue('/\u{1044f}/u.test(&quot;\ud801\udc4f&quot;)');
+shouldBeTrue('/\ud801\udc4f/u.test(&quot;\u{1044f}&quot;)');
+
+// Test basic unicode flag processing
+shouldBe('&quot;\u{1d306}&quot;.match(/\u{1d306}/u)[0].length', '2');
+shouldBeTrue('/(\u{10000}|\u{10400}|\u{10429})/u.test(&quot;\u{10400}&quot;)');
+shouldBe('&quot;\u{10123}&quot;.match(/a|\u{10123}|b/u)[0].length', '2');
+shouldBe('&quot;b&quot;.match(/a|\u{10123}|b/u)[0].length', '1');
+shouldBeFalse('/(?:a|\u{10123}|b)x/u.test(&quot;\u{10123}&quot;)');
+shouldBeTrue('/(?:a|\u{10123}|b)x/u.test(&quot;\u{10123}x&quot;)');
+shouldBeFalse('/(?:a|\u{10123}|b)x/u.test(&quot;b&quot;)');
+shouldBeTrue('/(?:a|\u{10123}|b)x/u.test(&quot;bx&quot;)');
+shouldBe('&quot;a\u{10123}x&quot;.match(/a\u{10123}b|a\u{10123}x/u)[0].length', '4');
+
+// Test unicode flag with ignore case
+shouldBeTrue('/(\u{10000}|\u{10400}|\u{10429})x/ui.test(&quot;\u{10400}x&quot;)');
+shouldBeTrue('/(\u{10000}|\u{10400}|\u{10429})x/ui.test(&quot;\u{10429}x&quot;)');
+shouldBeTrue('/(\u{10000}|\u{10400}|\u{10429})x/ui.test(&quot;\u{10401}x&quot;)');
+shouldBeTrue('/(\u{10000}|\u{10400}|\u{10429})x/ui.test(&quot;\u{10428}x&quot;)');
+shouldBe('&quot;\u{10429}&quot;.match(/a|\u{10401}|b/iu)[0].length', '2');
+shouldBe('&quot;B&quot;.match(/a|\u{10123}|b/iu)[0].length', '1');
+shouldBeFalse('/(?:A|\u{10123}|b)x/iu.test(&quot;\u{10123}&quot;)');
+shouldBeTrue('/(?:A|\u{10123}|b)x/iu.test(&quot;\u{10123}x&quot;)');
+shouldBeFalse('/(?:A|\u{10123}|b)x/iu.test(&quot;b&quot;)');
+shouldBeTrue('/(?:A|\u{10123}|b)x/iu.test(&quot;bx&quot;)');
+shouldBe('&quot;a\u{10123}X&quot;.match(/a\u{10123}b|a\u{10123}x/iu)[0].length', '4');
+shouldBe('&quot;\u0164x&quot;.match(/\u0165x/iu)[0].length', '2');
+
+// Test . matches with Unicode flag
+shouldBe('&quot;\u{1D306}&quot;.match(/^.$/u)[0].length', '2');
+shouldBe('&quot;It is 78\u00B0&quot;.match(/.*/u)[0].length', '9');
+// FIXME: These tests are disabled until https://bugs.webkit.org/show_bug.cgi?id=154863 is fixed
+// shouldBe('&quot;\ud801XXX&quot;.match(/.*/u)[0].length', '4'); // We should match a dangling first surrogate as 1 character
+// shouldBe('&quot;X\udfffXX&quot;.match(/.*/u)[0].length', '4'); // We should match a dangling second surrogate as 1 character
+
+// Test character classes with unicode characters with and without unicode flag
+shouldBe('&quot;\u{1d306}&quot;.match(/[\u{1d306}a]/)[0].length', '1');
+shouldBe('&quot;\u{1d306}&quot;.match(/[a\u{1d306}]/u)[0].length', '2');
+shouldBe('&quot;\u{1d306}&quot;.match(/[\u{1d306}a]/u)[0].length', '2');
+shouldBe('&quot;\u{1d306}&quot;.match(/[a-\u{1d306}]/)[0].length', '1');
+shouldBe('&quot;\u{1d306}&quot;.match(/[a-\u{1d306}]/u)[0].length', '2');
+
+// Test a character class that is a range from one UTF16 to a Unicode character
+shouldBe('&quot;X&quot;.match(/[\u0020-\ud801\udc4f]/u)[0].length', '1');
+shouldBe('&quot;\u1000&quot;.match(/[\u0020-\ud801\udc4f]/u)[0].length', '1');
+shouldBe('&quot;\ud801\udc27&quot;.match(/[\u0020-\ud801\udc4f]/u)[0].length', '2');
+
+var re1 = new RegExp(&quot;[^\u0020-\ud801\udc4f]&quot;, &quot;u&quot;);
+shouldBeFalse('re1.test(&quot;Z&quot;)');
+shouldBeFalse('re1.test(&quot;\u{1000}&quot;)');
+shouldBeFalse('re1.test(&quot;\u{10400}&quot;)');
+
+var re2 = new RegExp(&quot;[a-z\u{10000}-\u{15000}]&quot;, &quot;iu&quot;);
+shouldBeTrue('re2.test(&quot;A&quot;)');
+shouldBeFalse('re2.test(&quot;\uffff&quot;)');
+shouldBeTrue('re2.test(&quot;\u{12345}&quot;)');
+
+// Make sure we properly handle dangling surrogates and combined surrogates
+// FIXME: These tests are disabled until https://bugs.webkit.org/show_bug.cgi?id=154863 is fixed
+// shouldBe('/[\u{10c01}\uD803#\uDC01]/u.exec(&quot;\u{10c01}&quot;).toString()', '&quot;\u{10c01}&quot;');
+// shouldBe('/[\uD803\u{10c01}\uDC01]/u.exec(&quot;\u{10c01}&quot;).toString()', '&quot;\u{10c01}&quot;');
+// shouldBe('/[\uD803#\uDC01\u{10c01}]/u.exec(&quot;\u{10c01}&quot;).toString()', '&quot;\u{10c01}&quot;');
+// shouldBe('/[\uD803\uD803\uDC01\uDC01]/u.exec(&quot;\u{10c01}&quot;).toString()', '&quot;\u{10c01}&quot;');
+// shouldBeNull('/[\u{10c01}\uD803#\uDC01]{2}/u.exec(&quot;\u{10c01}&quot;)');
+// shouldBeNull('/[\uD803\u{10c01}\uDC01]{2}/u.exec(&quot;\u{10c01}&quot;)');
+// shouldBeNull('/[\uD803#\uDC01\u{10c01}]{2}/u.exec(&quot;\u{10c01}&quot;)');
+// shouldBeNull('/[\uD803\uD803\uDC01\uDC01]{2}/u.exec(&quot;\u{10c01}&quot;)');
+// shouldBe('/\uD803|\uDC01|\u{10c01}/u.exec(&quot;\u{10c01}&quot;).toString()', '&quot;\u{10c01}&quot;');
+// shouldBe('/\uD803|\uD803\uDC01|\uDC01/u.exec(&quot;\u{10c01}&quot;).toString()', '&quot;\u{10c01}&quot;');
+// shouldBe('/\uD803|\uDC01|\u{10c01}/u.exec(&quot;\u{D803}&quot;).toString()', '&quot;\u{D803}&quot;');
+// shouldBe('/\uD803|\uD803\uDC01|\uDC01/u.exec(&quot;\u{DC01}&quot;).toString()', '&quot;\u{DC01}&quot;');
+// shouldBeNull('/\uD803\u{10c01}/u.exec(&quot;\u{10c01}&quot;)');
+// shouldBeNull('/\uD803\u{10c01}/u.exec(&quot;\uD803&quot;)');
+// shouldBe('&quot;\uD803\u{10c01}&quot;.match(/\uD803\u{10c01}/u)[0].length', '3');
+
+// Check backtracking on partial matches
+shouldBe('&quot;\u{10311}\u{10311}\u{10311}&quot;.match(/\u{10311}*a|\u{10311}*./u)[0]', '&quot;\u{10311}\u{10311}\u{10311}&quot;');
+shouldBe('&quot;a\u{10311}\u{10311}&quot;.match(/a\u{10311}*?$/u)[0]', '&quot;a\u{10311}\u{10311}&quot;');
+shouldBe('&quot;a\u{10311}\u{10311}\u{10311}c&quot;.match(/a\u{10311}*cd|a\u{10311}*c/u)[0]', '&quot;a\u{10311}\u{10311}\u{10311}c&quot;');
+shouldBe('&quot;a\u{10311}\u{10311}\u{10311}c&quot;.match(/a\u{10311}+cd|a\u{10311}+c/u)[0]', '&quot;a\u{10311}\u{10311}\u{10311}c&quot;');
+shouldBe('&quot;\u{10311}\u{10311}\u{10311}&quot;.match(/\u{10311}+?a|\u{10311}+?./u)[0]', '&quot;\u{10311}\u{10311}&quot;');
+shouldBe('&quot;\u{10311}\u{10311}\u{10311}&quot;.match(/\u{10311}+?a|\u{10311}+?$/u)[0]', '&quot;\u{10311}\u{10311}\u{10311}&quot;');
+shouldBe('&quot;a\u{10311}\u{10311}\u{10311}c&quot;.match(/a\u{10311}*?cd|a\u{10311}*?c/u)[0]', '&quot;a\u{10311}\u{10311}\u{10311}c&quot;');
+shouldBe('&quot;a\u{10311}\u{10311}\u{10311}c&quot;.match(/a\u{10311}+?cd|a\u{10311}+?c/u)[0]', '&quot;a\u{10311}\u{10311}\u{10311}c&quot;');
+shouldBe('&quot;\u{10311}\u{10311}\u{10311}&quot;.match(/\u{10311}+?a|\u{10311}+?./iu)[0]', '&quot;\u{10311}\u{10311}&quot;');
+shouldBe('&quot;\u{1042a}\u{1042a}\u{10311}&quot;.match(/\u{10402}*\u{10200}|\u{10402}*\u{10311}/iu)[0]', '&quot;\u{1042a}\u{1042a}\u{10311}&quot;');
+shouldBe('&quot;\u{1042a}\u{1042a}\u{10311}&quot;.match(/\u{10402}+\u{10200}|\u{10402}+\u{10311}/iu)[0]', '&quot;\u{1042a}\u{1042a}\u{10311}&quot;');
+shouldBe('&quot;\u{1042a}\u{1042a}\u{10311}&quot;.match(/\u{10402}*?\u{10200}|\u{10402}*?\u{10311}/iu)[0]', '&quot;\u{1042a}\u{1042a}\u{10311}&quot;');
+shouldBe('&quot;\u{1042a}\u{1042a}\u{10311}&quot;.match(/\u{10402}+?\u{10200}|\u{10402}+?\u{10311}/iu)[0]', '&quot;\u{1042a}\u{1042a}\u{10311}&quot;');
+shouldBe('&quot;ab\u{10311}c\u{10a01}&quot;.match(/abc|ab\u{10311}cd|ab\u{10311}c\u{10a01}d|ab\u{10311}c\u{10a01}/u)[0]', '&quot;ab\u{10311}c\u{10a01}&quot;');
+shouldBe('&quot;ab\u{10428}c\u{10a01}&quot;.match(/abc|ab\u{10400}cd|ab\u{10400}c\u{10a01}d|ab\u{10400}c\u{10a01}/iu)[0]', '&quot;ab\u{10428}c\u{10a01}&quot;');
+shouldBeFalse('/abc|ab\u{10400}cd|ab\u{10400}c\u{10a01}d|ab\u{10400}c\u{10a01}/iu.test(&quot;qwerty123&quot;)');
+shouldBe('&quot;a\u{10428}\u{10428}\u{10428}c&quot;.match(/ac|a\u{10400}*cd|a\u{10400}+cd|a\u{10400}+c/iu)[0]', '&quot;a\u{10428}\u{10428}\u{10428}c&quot;');
+shouldBe('&quot;ab\u{10428}\u{10428}\u{10428}c\u{10a01}&quot;.match(/abc|ab\u{10400}*cd|ab\u{10400}+c\u{10a01}d|ab\u{10400}+c\u{10a01}/iu)[0]', '&quot;ab\u{10428}\u{10428}\u{10428}c\u{10a01}&quot;');
+shouldBe('&quot;ab\u{10428}\u{10428}\u{10428}&quot;.match(/abc|ab\u{10428}*./u)[0]', '&quot;ab\u{10428}\u{10428}\u{10428}&quot;');
+shouldBe('&quot;ab\u{10428}\u{10428}\u{10428}&quot;.match(/abc|ab\u{10400}*./iu)[0]', '&quot;ab\u{10428}\u{10428}\u{10428}&quot;');
+
+var re3 = new RegExp(&quot;(a\u{10410}*bc)|(a\u{10410}*b)&quot;, &quot;u&quot;);
+var match3 = &quot;a\u{10410}\u{10410}b&quot;.match(re3);
+shouldBe('match3[0]', '&quot;a\u{10410}\u{10410}b&quot;');
+shouldBeUndefined('match3[1]');
+shouldBe('match3[2]', '&quot;a\u{10410}\u{10410}b&quot;');
+
+var re4 = new RegExp(&quot;a(\u{10410}*)bc|a(\u{10410}*)b&quot;, &quot;ui&quot;);
+var match4 = &quot;a\u{10438}\u{10438}b&quot;.match(re4);
+shouldBe('match4[0]', '&quot;a\u{10438}\u{10438}b&quot;');
+shouldBeUndefined('match4[1]');
+shouldBe('match4[2]', '&quot;\u{10438}\u{10438}&quot;');
+
+var match5 = &quot;a\u{10412}\u{10412}b\u{10412}\u{10412}&quot;.match(/a(\u{10412}*)bc\1|a(\u{10412}*)b\2/u);
+shouldBe('match5[0]', '&quot;a\u{10412}\u{10412}b\u{10412}\u{10412}&quot;');
+shouldBeUndefined('match5[1]');
+shouldBe('match5[2]', '&quot;\u{10412}\u{10412}&quot;');
+
+var match6 = &quot;a\u{10412}\u{10412}b\u{1043a}\u{10412}\u{1043a}&quot;.match(/a(\u{1043a}*)bc\1|a(\u{1043a}*)b\2/iu);
+shouldBe('match6[0]', '&quot;a\u{10412}\u{10412}b\u{1043a}\u{10412}&quot;');
+shouldBeUndefined('match6[1]');
+shouldBe('match6[2]', '&quot;\u{10412}\u{10412}&quot;');
+
+// Miscellaneous tests
+shouldBeTrue('/\u1e9Abc/ui.test(&quot;abc&quot;)');
+shouldBeTrue('/abc/ui.test(&quot;\u1e9Abc&quot;)');
+shouldBeTrue('/tex\u1e97/ui.test(&quot;text&quot;)');
+shouldBeTrue('/text/ui.test(&quot;\u1e97ext&quot;)');
</ins></span></pre></div>
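<p>A few of the behaviours exercised by the new test can be seen in isolation. This is a small sketch, assuming an engine with ES6 unicode regular expression support; the object and property names are illustrative only. Note that the test file writes its patterns inside string literals, so escapes such as \u{61} are resolved before the pattern is parsed; the sketch below only uses \u{...} inside patterns that carry the u flag.</p>

```javascript
// U+1D306 lies outside the Basic Multilingual Plane, so it occupies two
// UTF-16 code units in a JavaScript string.
const tetragram = "\u{1d306}";

const checks = {
    // Number of UTF-16 code units in the string.
    unitsInString: tetragram.length,
    // Without the u flag, "." consumes a single code unit, so ^.$ cannot match.
    dotWithoutU: /^.$/.test(tetragram),
    // With the u flag, "." consumes the whole surrogate pair.
    dotWithU: /^.$/u.test(tetragram),
    // The matched text still has length 2 when measured in code units.
    matchLengthWithU: tetragram.match(/^.$/u)[0].length,
    // Case-insensitive matching works for supplementary characters
    // (U+10400 and U+10428 are an upper/lower case pair).
    caseFoldWithU: /\u{10400}/ui.test("\u{10428}"),
    // Character class ranges can span supplementary code points.
    rangeWithU: /[\u{10000}-\u{15000}]/u.test("\u{12345}"),
};

console.log(checks);
```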
<a id="trunkSourceJavaScriptCoreCMakeListstxt"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/CMakeLists.txt (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/CMakeLists.txt        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/CMakeLists.txt        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -826,7 +826,7 @@
</span><span class="cx">     wasm/WASMReader.cpp
</span><span class="cx"> 
</span><span class="cx">     yarr/RegularExpression.cpp
</span><del>-    yarr/YarrCanonicalizeUCS2.cpp
</del><ins>+    yarr/YarrCanonicalizeUnicode.cpp
</ins><span class="cx">     yarr/YarrInterpreter.cpp
</span><span class="cx">     yarr/YarrJIT.cpp
</span><span class="cx">     yarr/YarrPattern.cpp
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreChangeLog"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/ChangeLog (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/ChangeLog        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/ChangeLog        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -1,3 +1,169 @@
</span><ins>+2016-03-01  Michael Saboff  &lt;msaboff@apple.com&gt;
+
+        [ES6] Add support for Unicode regular expressions
+        https://bugs.webkit.org/show_bug.cgi?id=154842
+
+        Reviewed by Filip Pizlo.
+
+        Added processing of Unicode regular expressions to the Yarr interpreter.
+
+        Changed parsing of regular expression patterns and PatternTerms to process characters as
+        UChar32 in the Yarr code.  The parser converts matched surrogate pairs into the appropriate
+        Unicode character when the expression is parsed.  When matching a unicode expression and
+        reading source characters, we convert a proper surrogate pair into a Unicode character and
+        advance the source cursor, &quot;pos&quot;, one more position.  The exception is when we know,
+        while generating a fixed character atom, that we need to match a Unicode character
+        that doesn't fit in 16 bits.  The code calls this an extendedUnicodeCharacter and has a
+        helper to determine this.
+
+        Added 'u' flag and 'unicode' identifier to regular expression classes.  Added an &quot;isUnicode&quot;
+        parameter to YarrPattern pattern() and internal users of that function.
+
+        Updated the generation of the canonicalization tables to include a new set of tables that
+        follow ES 6.0, 21.2.2.8.2 Step 2.  Renamed the YarrCanonicalizeUCS2.* files to
+        YarrCanonicalizeUnicode.*.
+
+        Added a new LayoutTests/js test that tests the added functionality.  Updated other tests that
+        have minor ES6 unicode checks and look for valid flags.
+
+        Ran the ChakraCore Unicode regular expression tests as well.
+
+        * CMakeLists.txt:
+        * JavaScriptCore.vcxproj/JavaScriptCore.vcxproj:
+        * JavaScriptCore.vcxproj/JavaScriptCore.vcxproj.filters:
+        * JavaScriptCore.xcodeproj/project.pbxproj:
+
+        * inspector/ContentSearchUtilities.cpp:
+        (Inspector::ContentSearchUtilities::findMagicComment):
+        * yarr/RegularExpression.cpp:
+        (JSC::Yarr::RegularExpression::Private::compile):
+        Updated use of pattern().
+
+        * runtime/CommonIdentifiers.h:
+        * runtime/RegExp.cpp:
+        (JSC::regExpFlags):
+        (JSC::RegExpFunctionalTestCollector::outputOneTest):
+        (JSC::RegExp::finishCreation):
+        (JSC::RegExp::compile):
+        (JSC::RegExp::compileMatchOnly):
+        * runtime/RegExp.h:
+        * runtime/RegExpKey.h:
+        * runtime/RegExpPrototype.cpp:
+        (JSC::regExpProtoFuncCompile):
+        (JSC::flagsString):
+        (JSC::regExpProtoGetterMultiline):
+        (JSC::regExpProtoGetterUnicode):
+        (JSC::regExpProtoGetterFlags):
+        Updated for new 'u' (unicode) flag.  Added a check to use the interpreter for unicode regular expressions.
+
+        * tests/es6.yaml:
+        * tests/stress/static-getter-in-names.js:
+        Updated tests for new flag and for passing the minimal es6 regular expression processing.
+
+        * yarr/Yarr.h: Updated the size of information now kept for backtracking.
+
+        * yarr/YarrCanonicalizeUCS2.cpp: Removed.
+        * yarr/YarrCanonicalizeUCS2.h: Removed.
+        * yarr/YarrCanonicalizeUCS2.js: Removed.
+        * yarr/YarrCanonicalizeUnicode.cpp: Copied from Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.cpp.
+        * yarr/YarrCanonicalizeUnicode.h: Copied from Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.h.
+        (JSC::Yarr::canonicalCharacterSetInfo):
+        (JSC::Yarr::canonicalRangeInfoFor):
+        (JSC::Yarr::getCanonicalPair):
+        (JSC::Yarr::isCanonicallyUnique):
+        (JSC::Yarr::areCanonicallyEquivalent):
+        (JSC::Yarr::rangeInfoFor): Deleted.
+        * yarr/YarrCanonicalizeUnicode.js: Copied from Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.js.
+        (printHeader):
+        (printFooter):
+        (hex):
+        (canonicalize):
+        (canonicalizeUnicode):
+        (createUCS2CanonicalGroups):
+        (createUnicodeCanonicalGroups):
+        (cu.in.groupedCanonically.characters.sort): Deleted.
+        (cu.in.groupedCanonically.else): Deleted.
+        Refactored to output two sets of tables, one for UCS2 and one for Unicode.  The UCS2 tables follow
+        the legacy canonicalization rules now specified in ES 6.0, 21.2.2.8.2 Step 3.  The new Unicode
+        tables follow the rules specified in ES 6.0, 21.2.2.8.2 Step 2.  Eliminated the unused Latin1 tables.
+
+        * yarr/YarrInterpreter.cpp:
+        (JSC::Yarr::Interpreter::InputStream::InputStream):
+        (JSC::Yarr::Interpreter::InputStream::readChecked):
+        (JSC::Yarr::Interpreter::InputStream::readSurrogatePairChecked):
+        (JSC::Yarr::Interpreter::InputStream::reread):
+        (JSC::Yarr::Interpreter::InputStream::prev):
+        (JSC::Yarr::Interpreter::testCharacterClass):
+        (JSC::Yarr::Interpreter::checkCharacter):
+        (JSC::Yarr::Interpreter::checkSurrogatePair):
+        (JSC::Yarr::Interpreter::checkCasedCharacter):
+        (JSC::Yarr::Interpreter::tryConsumeBackReference):
+        (JSC::Yarr::Interpreter::backtrackPatternCharacter):
+        (JSC::Yarr::Interpreter::matchCharacterClass):
+        (JSC::Yarr::Interpreter::backtrackCharacterClass):
+        (JSC::Yarr::Interpreter::matchParenthesesTerminalEnd):
+        (JSC::Yarr::Interpreter::matchDisjunction):
+        (JSC::Yarr::Interpreter::Interpreter):
+        (JSC::Yarr::ByteCompiler::assertionWordBoundary):
+        (JSC::Yarr::ByteCompiler::atomPatternCharacter):
+        * yarr/YarrInterpreter.h:
+        (JSC::Yarr::ByteTerm::ByteTerm):
+        (JSC::Yarr::BytecodePattern::BytecodePattern):
+        * yarr/YarrJIT.cpp:
+        (JSC::Yarr::YarrGenerator::optimizeAlternative):
+        (JSC::Yarr::YarrGenerator::matchCharacterClassRange):
+        (JSC::Yarr::YarrGenerator::matchCharacterClass):
+        (JSC::Yarr::YarrGenerator::notAtEndOfInput):
+        (JSC::Yarr::YarrGenerator::jumpIfCharNotEquals):
+        (JSC::Yarr::YarrGenerator::generatePatternCharacterOnce):
+        (JSC::Yarr::YarrGenerator::generatePatternCharacterFixed):
+        (JSC::Yarr::YarrGenerator::generatePatternCharacterGreedy):
+        (JSC::Yarr::YarrGenerator::backtrackPatternCharacterNonGreedy):
+        * yarr/YarrParser.h:
+        (JSC::Yarr::Parser::CharacterClassParserDelegate::atomPatternCharacter):
+        (JSC::Yarr::Parser::Parser):
+        (JSC::Yarr::Parser::parseEscape):
+        (JSC::Yarr::Parser::consumePossibleSurrogatePair):
+        (JSC::Yarr::Parser::parseCharacterClass):
+        (JSC::Yarr::Parser::parseTokens):
+        (JSC::Yarr::Parser::parse):
+        (JSC::Yarr::Parser::atEndOfPattern):
+        (JSC::Yarr::Parser::patternRemaining):
+        (JSC::Yarr::Parser::peek):
+        (JSC::Yarr::parse):
+        * yarr/YarrPattern.cpp:
+        (JSC::Yarr::CharacterClassConstructor::CharacterClassConstructor):
+        (JSC::Yarr::CharacterClassConstructor::append):
+        (JSC::Yarr::CharacterClassConstructor::putChar):
+        (JSC::Yarr::CharacterClassConstructor::putUnicodeIgnoreCase):
+        (JSC::Yarr::CharacterClassConstructor::putRange):
+        (JSC::Yarr::CharacterClassConstructor::charClass):
+        (JSC::Yarr::CharacterClassConstructor::addSorted):
+        (JSC::Yarr::CharacterClassConstructor::addSortedRange):
+        (JSC::Yarr::YarrPatternConstructor::YarrPatternConstructor):
+        (JSC::Yarr::YarrPatternConstructor::assertionWordBoundary):
+        (JSC::Yarr::YarrPatternConstructor::atomPatternCharacter):
+        (JSC::Yarr::YarrPatternConstructor::atomCharacterClassBegin):
+        (JSC::Yarr::YarrPatternConstructor::atomCharacterClassAtom):
+        (JSC::Yarr::YarrPatternConstructor::atomCharacterClassRange):
+        (JSC::Yarr::YarrPatternConstructor::setupAlternativeOffsets):
+        (JSC::Yarr::YarrPattern::compile):
+        (JSC::Yarr::YarrPattern::YarrPattern):
+        * yarr/YarrPattern.h:
+        (JSC::Yarr::CharacterRange::CharacterRange):
+        (JSC::Yarr::CharacterClass::CharacterClass):
+        (JSC::Yarr::PatternTerm::PatternTerm):
+        (JSC::Yarr::YarrPattern::reset):
+        * yarr/YarrSyntaxChecker.cpp:
+        (JSC::Yarr::SyntaxChecker::assertionBOL):
+        (JSC::Yarr::SyntaxChecker::assertionEOL):
+        (JSC::Yarr::SyntaxChecker::assertionWordBoundary):
+        (JSC::Yarr::SyntaxChecker::atomPatternCharacter):
+        (JSC::Yarr::SyntaxChecker::atomBuiltInCharacterClass):
+        (JSC::Yarr::SyntaxChecker::atomCharacterClassBegin):
+        (JSC::Yarr::SyntaxChecker::atomCharacterClassAtom):
+        (JSC::Yarr::checkSyntax):
+
</ins><span class="cx"> 2016-03-01  Saam barati  &lt;sbarati@apple.com&gt;
</span><span class="cx"> 
</span><span class="cx">         Remove FIXMEs and add valid test cases after necessary patch has landed.
</span></span></pre></div>
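<p>The ChangeLog entry above describes reading source characters so that a proper surrogate pair becomes one Unicode character and the cursor advances one extra position. The idea can be sketched as follows. This is an illustration only, not the Yarr implementation: the function names and the returned object shape are hypothetical.</p>

```javascript
// Hypothetical helpers: classify a UTF-16 code unit by its top six bits.
function isLeadSurrogate(unit) {
    return (unit >>> 10) === 0x36; // 0xD800..0xDBFF
}

function isTrailSurrogate(unit) {
    return (unit >>> 10) === 0x37; // 0xDC00..0xDFFF
}

// Reads one character from input at index pos. In unicode mode a lead
// surrogate followed by a trail surrogate is combined into a single code
// point and the cursor moves two positions; otherwise one code unit is
// returned and the cursor moves one position.
function readChar(input, pos, unicode) {
    const first = input.charCodeAt(pos);
    if (unicode === true) {
        if (isLeadSurrogate(first)) {
            // charCodeAt returns NaN past the end, which is not a trail surrogate.
            const second = input.charCodeAt(pos + 1);
            if (isTrailSurrogate(second)) {
                const codePoint = 0x10000 + (first - 0xD800) * 0x400 + (second - 0xDC00);
                return { codePoint, next: pos + 2 };
            }
        }
    }
    return { codePoint: first, next: pos + 1 };
}

const paired = readChar("\u{1044f}", 0, true);   // combined pair
const legacy = readChar("\u{1044f}", 0, false);  // lead surrogate only
const dangling = readChar("\ud801X", 0, true);   // unpaired lead surrogate
console.log(paired, legacy, dangling);
```

<p>In this sketch a dangling surrogate is returned as a single code unit, which is the behaviour the disabled tests in regexp-unicode.js expect once bug 154863 is fixed.</p>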
<a id="trunkSourceJavaScriptCoreJavaScriptCorevcxprojJavaScriptCorevcxproj"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/JavaScriptCore.vcxproj/JavaScriptCore.vcxproj (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/JavaScriptCore.vcxproj/JavaScriptCore.vcxproj        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/JavaScriptCore.vcxproj/JavaScriptCore.vcxproj        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -937,7 +937,7 @@
</span><span class="cx">     &lt;ClCompile Include=&quot;..\wasm\WASMModuleParser.cpp&quot; /&gt;
</span><span class="cx">     &lt;ClCompile Include=&quot;..\wasm\WASMReader.cpp&quot; /&gt;
</span><span class="cx">     &lt;ClCompile Include=&quot;..\yarr\RegularExpression.cpp&quot; /&gt;
</span><del>-    &lt;ClCompile Include=&quot;..\yarr\YarrCanonicalizeUCS2.cpp&quot; /&gt;
</del><ins>+    &lt;ClCompile Include=&quot;..\yarr\YarrCanonicalizeUnicode.cpp&quot; /&gt;
</ins><span class="cx">     &lt;ClCompile Include=&quot;..\yarr\YarrInterpreter.cpp&quot; /&gt;
</span><span class="cx">     &lt;ClCompile Include=&quot;..\yarr\YarrJIT.cpp&quot; /&gt;
</span><span class="cx">     &lt;ClCompile Include=&quot;..\yarr\YarrPattern.cpp&quot; /&gt;
</span><span class="lines">@@ -1878,7 +1878,7 @@
</span><span class="cx">     &lt;ClInclude Include=&quot;..\wasm\WASMReader.h&quot; /&gt;
</span><span class="cx">     &lt;ClInclude Include=&quot;..\yarr\RegularExpression.h&quot; /&gt;
</span><span class="cx">     &lt;ClInclude Include=&quot;..\yarr\Yarr.h&quot; /&gt;
</span><del>-    &lt;ClInclude Include=&quot;..\yarr\YarrCanonicalizeUCS2.h&quot; /&gt;
</del><ins>+    &lt;ClInclude Include=&quot;..\yarr\YarrCanonicalizeUnicode.h&quot; /&gt;
</ins><span class="cx">     &lt;ClInclude Include=&quot;..\yarr\YarrInterpreter.h&quot; /&gt;
</span><span class="cx">     &lt;ClInclude Include=&quot;..\yarr\YarrJIT.h&quot; /&gt;
</span><span class="cx">     &lt;ClInclude Include=&quot;..\yarr\YarrParser.h&quot; /&gt;
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreJavaScriptCorevcxprojJavaScriptCorevcxprojfilters"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/JavaScriptCore.vcxproj/JavaScriptCore.vcxproj.filters (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/JavaScriptCore.vcxproj/JavaScriptCore.vcxproj.filters        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/JavaScriptCore.vcxproj/JavaScriptCore.vcxproj.filters        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -1059,7 +1059,7 @@
</span><span class="cx">     &lt;ClCompile Include=&quot;..\yarr\RegularExpression.cpp&quot;&gt;
</span><span class="cx">       &lt;Filter&gt;yarr&lt;/Filter&gt;
</span><span class="cx">     &lt;/ClCompile&gt;
</span><del>-    &lt;ClCompile Include=&quot;..\yarr\YarrCanonicalizeUCS2.cpp&quot;&gt;
</del><ins>+    &lt;ClCompile Include=&quot;..\yarr\YarrCanonicalizeUnicode.cpp&quot;&gt;
</ins><span class="cx">       &lt;Filter&gt;yarr&lt;/Filter&gt;
</span><span class="cx">     &lt;/ClCompile&gt;
</span><span class="cx">     &lt;ClCompile Include=&quot;..\yarr\YarrInterpreter.cpp&quot;&gt;
</span><span class="lines">@@ -3316,7 +3316,7 @@
</span><span class="cx">     &lt;ClInclude Include=&quot;..\yarr\RegularExpression.h&quot;&gt;
</span><span class="cx">       &lt;Filter&gt;yarr&lt;/Filter&gt;
</span><span class="cx">     &lt;/ClInclude&gt;
</span><del>-    &lt;ClInclude Include=&quot;..\yarr\YarrCanonicalizeUCS2.h&quot;&gt;
</del><ins>+    &lt;ClInclude Include=&quot;..\yarr\YarrCanonicalizeUnicode.h&quot;&gt;
</ins><span class="cx">       &lt;Filter&gt;yarr&lt;/Filter&gt;
</span><span class="cx">     &lt;/ClInclude&gt;
</span><span class="cx">     &lt;ClInclude Include=&quot;..\yarr\YarrInterpreter.h&quot;&gt;
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreJavaScriptCorexcodeprojprojectpbxproj"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -1324,7 +1324,7 @@
</span><span class="cx">                 862553D116136DA9009F17D0 /* JSProxy.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 862553CE16136AA5009F17D0 /* JSProxy.cpp */; };
</span><span class="cx">                 862553D216136E1A009F17D0 /* JSProxy.h in Headers */ = {isa = PBXBuildFile; fileRef = 862553CF16136AA5009F17D0 /* JSProxy.h */; settings = {ATTRIBUTES = (Private, ); }; };
</span><span class="cx">                 863B23E00FC6118900703AA4 /* MacroAssemblerCodeRef.h in Headers */ = {isa = PBXBuildFile; fileRef = 863B23DF0FC60E6200703AA4 /* MacroAssemblerCodeRef.h */; settings = {ATTRIBUTES = (Private, ); }; };
</span><del>-                863C6D9C1521111A00585E4E /* YarrCanonicalizeUCS2.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 863C6D981521111200585E4E /* YarrCanonicalizeUCS2.cpp */; };
</del><ins>+                863C6D9C1521111A00585E4E /* YarrCanonicalizeUnicode.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 863C6D981521111200585E4E /* YarrCanonicalizeUnicode.cpp */; };
</ins><span class="cx">                 8642C510151C06A90046D4EF /* RegExpCachedResult.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 86F75EFB151C062F007C9BA3 /* RegExpCachedResult.cpp */; };
</span><span class="cx">                 8642C512151C083D0046D4EF /* RegExpMatchesArray.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 86F75EFD151C062F007C9BA3 /* RegExpMatchesArray.cpp */; };
</span><span class="cx">                 865A30F1135007E100CDB49E /* JSCJSValueInlines.h in Headers */ = {isa = PBXBuildFile; fileRef = 865A30F0135007E100CDB49E /* JSCJSValueInlines.h */; settings = {ATTRIBUTES = (Private, ); }; };
</span><span class="lines">@@ -3489,9 +3489,9 @@
</span><span class="cx">                 862553CE16136AA5009F17D0 /* JSProxy.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = JSProxy.cpp; sourceTree = &quot;&lt;group&gt;&quot;; };
</span><span class="cx">                 862553CF16136AA5009F17D0 /* JSProxy.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = JSProxy.h; sourceTree = &quot;&lt;group&gt;&quot;; };
</span><span class="cx">                 863B23DF0FC60E6200703AA4 /* MacroAssemblerCodeRef.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = MacroAssemblerCodeRef.h; sourceTree = &quot;&lt;group&gt;&quot;; };
</span><del>-                863C6D981521111200585E4E /* YarrCanonicalizeUCS2.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = YarrCanonicalizeUCS2.cpp; path = yarr/YarrCanonicalizeUCS2.cpp; sourceTree = &quot;&lt;group&gt;&quot;; };
-                863C6D991521111200585E4E /* YarrCanonicalizeUCS2.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = YarrCanonicalizeUCS2.h; path = yarr/YarrCanonicalizeUCS2.h; sourceTree = &quot;&lt;group&gt;&quot;; };
-                863C6D9A1521111200585E4E /* YarrCanonicalizeUCS2.js */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.javascript; name = YarrCanonicalizeUCS2.js; path = yarr/YarrCanonicalizeUCS2.js; sourceTree = &quot;&lt;group&gt;&quot;; };
</del><ins>+                863C6D981521111200585E4E /* YarrCanonicalizeUnicode.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = YarrCanonicalizeUnicode.cpp; path = yarr/YarrCanonicalizeUnicode.cpp; sourceTree = &quot;&lt;group&gt;&quot;; };
+                863C6D991521111200585E4E /* YarrCanonicalizeUnicode.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = YarrCanonicalizeUnicode.h; path = yarr/YarrCanonicalizeUnicode.h; sourceTree = &quot;&lt;group&gt;&quot;; };
+                863C6D9A1521111200585E4E /* YarrCanonicalizeUnicode.js */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.javascript; name = YarrCanonicalizeUnicode.js; path = yarr/YarrCanonicalizeUnicode.js; sourceTree = &quot;&lt;group&gt;&quot;; };
</ins><span class="cx">                 8640923B156EED3B00566CB2 /* ARM64Assembler.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = ARM64Assembler.h; sourceTree = &quot;&lt;group&gt;&quot;; };
</span><span class="cx">                 8640923C156EED3B00566CB2 /* MacroAssemblerARM64.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = MacroAssemblerARM64.h; sourceTree = &quot;&lt;group&gt;&quot;; };
</span><span class="cx">                 865A30F0135007E100CDB49E /* JSCJSValueInlines.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = JSCJSValueInlines.h; sourceTree = &quot;&lt;group&gt;&quot;; };
</span><span class="lines">@@ -5996,9 +5996,9 @@
</span><span class="cx">                                 A57D23EB1891B5540031C7FA /* RegularExpression.cpp */,
</span><span class="cx">                                 A57D23EC1891B5540031C7FA /* RegularExpression.h */,
</span><span class="cx">                                 451539B812DC994500EF7AC4 /* Yarr.h */,
</span><del>-                                863C6D981521111200585E4E /* YarrCanonicalizeUCS2.cpp */,
-                                863C6D991521111200585E4E /* YarrCanonicalizeUCS2.h */,
-                                863C6D9A1521111200585E4E /* YarrCanonicalizeUCS2.js */,
</del><ins>+                                863C6D981521111200585E4E /* YarrCanonicalizeUnicode.cpp */,
+                                863C6D991521111200585E4E /* YarrCanonicalizeUnicode.h */,
+                                863C6D9A1521111200585E4E /* YarrCanonicalizeUnicode.js */,
</ins><span class="cx">                                 86704B7D12DBA33700A9FE7B /* YarrInterpreter.cpp */,
</span><span class="cx">                                 86704B7E12DBA33700A9FE7B /* YarrInterpreter.h */,
</span><span class="cx">                                 86704B7F12DBA33700A9FE7B /* YarrJIT.cpp */,
</span><span class="lines">@@ -9309,7 +9309,7 @@
</span><span class="cx">                                 0FC8150B14043C0E00CFA603 /* WriteBarrierSupport.cpp in Sources */,
</span><span class="cx">                                 A7E5AB3A1799E4B200D2833D /* X86Disassembler.cpp in Sources */,
</span><span class="cx">                                 0F2BBD971C5FF3F50023EF23 /* B3Variable.cpp in Sources */,
</span><del>-                                863C6D9C1521111A00585E4E /* YarrCanonicalizeUCS2.cpp in Sources */,
</del><ins>+                                863C6D9C1521111A00585E4E /* YarrCanonicalizeUnicode.cpp in Sources */,
</ins><span class="cx">                                 86704B8412DBA33700A9FE7B /* YarrInterpreter.cpp in Sources */,
</span><span class="cx">                                 86704B8612DBA33700A9FE7B /* YarrJIT.cpp in Sources */,
</span><span class="cx">                                 86704B8912DBA33700A9FE7B /* YarrPattern.cpp in Sources */,
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreinspectorContentSearchUtilitiescpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/inspector/ContentSearchUtilities.cpp (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/inspector/ContentSearchUtilities.cpp        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/inspector/ContentSearchUtilities.cpp        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -176,7 +176,7 @@
</span><span class="cx"> {
</span><span class="cx">     ASSERT(!content.isNull());
</span><span class="cx">     const char* error = nullptr;
</span><del>-    JSC::Yarr::YarrPattern pattern(patternString, false, true, &amp;error);
</del><ins>+    JSC::Yarr::YarrPattern pattern(patternString, false, true, false, &amp;error);
</ins><span class="cx">     ASSERT(!error);
</span><span class="cx">     BumpPointerAllocator regexAllocator;
</span><span class="cx">     auto bytecodePattern = JSC::Yarr::byteCompile(pattern, &amp;regexAllocator);
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreruntimeCommonIdentifiersh"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/runtime/CommonIdentifiers.h (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/runtime/CommonIdentifiers.h        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/runtime/CommonIdentifiers.h        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -211,6 +211,7 @@
</span><span class="cx">     macro(toPrecision) \
</span><span class="cx">     macro(toString) \
</span><span class="cx">     macro(top) \
</span><ins>+    macro(unicode) \
</ins><span class="cx">     macro(usage) \
</span><span class="cx">     macro(value) \
</span><span class="cx">     macro(valueOf) \
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreruntimeRegExpcpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/runtime/RegExp.cpp (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/runtime/RegExp.cpp        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/runtime/RegExp.cpp        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -66,6 +66,12 @@
</span><span class="cx">             flags = static_cast&lt;RegExpFlags&gt;(flags | FlagMultiline);
</span><span class="cx">             break;
</span><span class="cx"> 
</span><ins>+        case 'u':
+            if (flags &amp; FlagUnicode)
+                return InvalidFlags;
+            flags = static_cast&lt;RegExpFlags&gt;(flags | FlagUnicode);
+            break;
+                
</ins><span class="cx">         default:
</span><span class="cx">             return InvalidFlags;
</span><span class="cx">         }
</span><span class="lines">@@ -126,6 +132,8 @@
</span><span class="cx">             fputc('i', m_file);
</span><span class="cx">         if (regExp-&gt;multiline())
</span><span class="cx">             fputc('m', m_file);
</span><ins>+        if (regExp-&gt;unicode())
+            fputc('u', m_file);
</ins><span class="cx">         fprintf(m_file, &quot;\n&quot;);
</span><span class="cx">     }
</span><span class="cx"> 
</span><span class="lines">@@ -240,7 +248,7 @@
</span><span class="cx"> void RegExp::finishCreation(VM&amp; vm)
</span><span class="cx"> {
</span><span class="cx">     Base::finishCreation(vm);
</span><del>-    Yarr::YarrPattern pattern(m_patternString, ignoreCase(), multiline(), &amp;m_constructionError);
</del><ins>+    Yarr::YarrPattern pattern(m_patternString, ignoreCase(), multiline(), unicode(), &amp;m_constructionError);
</ins><span class="cx">     if (m_constructionError)
</span><span class="cx">         m_state = ParseError;
</span><span class="cx">     else
</span><span class="lines">@@ -280,7 +288,7 @@
</span><span class="cx"> 
</span><span class="cx"> void RegExp::compile(VM* vm, Yarr::YarrCharSize charSize)
</span><span class="cx"> {
</span><del>-    Yarr::YarrPattern pattern(m_patternString, ignoreCase(), multiline(), &amp;m_constructionError);
</del><ins>+    Yarr::YarrPattern pattern(m_patternString, ignoreCase(), multiline(), unicode(), &amp;m_constructionError);
</ins><span class="cx">     if (m_constructionError) {
</span><span class="cx">         RELEASE_ASSERT_NOT_REACHED();
</span><span class="cx"> #if COMPILER_QUIRK(CONSIDERS_UNREACHABLE_CODE)
</span><span class="lines">@@ -297,7 +305,7 @@
</span><span class="cx">     }
</span><span class="cx"> 
</span><span class="cx"> #if ENABLE(YARR_JIT)
</span><del>-    if (!pattern.m_containsBackreferences &amp;&amp; !pattern.containsUnsignedLengthPattern() &amp;&amp; vm-&gt;canUseRegExpJIT()) {
</del><ins>+    if (!pattern.m_containsBackreferences &amp;&amp; !pattern.containsUnsignedLengthPattern() &amp;&amp; !unicode() &amp;&amp; vm-&gt;canUseRegExpJIT()) {
</ins><span class="cx">         Yarr::jitCompile(pattern, charSize, vm, m_regExpJITCode);
</span><span class="cx">         if (!m_regExpJITCode.isFallBack()) {
</span><span class="cx">             m_state = JITCode;
</span><span class="lines">@@ -399,7 +407,7 @@
</span><span class="cx"> 
</span><span class="cx"> void RegExp::compileMatchOnly(VM* vm, Yarr::YarrCharSize charSize)
</span><span class="cx"> {
</span><del>-    Yarr::YarrPattern pattern(m_patternString, ignoreCase(), multiline(), &amp;m_constructionError);
</del><ins>+    Yarr::YarrPattern pattern(m_patternString, ignoreCase(), multiline(), unicode(), &amp;m_constructionError);
</ins><span class="cx">     if (m_constructionError) {
</span><span class="cx">         RELEASE_ASSERT_NOT_REACHED();
</span><span class="cx"> #if COMPILER_QUIRK(CONSIDERS_UNREACHABLE_CODE)
</span><span class="lines">@@ -416,7 +424,7 @@
</span><span class="cx">     }
</span><span class="cx"> 
</span><span class="cx"> #if ENABLE(YARR_JIT)
</span><del>-    if (!pattern.m_containsBackreferences &amp;&amp; !pattern.containsUnsignedLengthPattern() &amp;&amp; vm-&gt;canUseRegExpJIT()) {
</del><ins>+    if (!pattern.m_containsBackreferences &amp;&amp; !pattern.containsUnsignedLengthPattern() &amp;&amp; !unicode() &amp;&amp; vm-&gt;canUseRegExpJIT()) {
</ins><span class="cx">         Yarr::jitCompile(pattern, charSize, vm, m_regExpJITCode, Yarr::MatchOnly);
</span><span class="cx">         if (!m_regExpJITCode.isFallBack()) {
</span><span class="cx">             m_state = JITCode;
</span></span></pre></div>
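<p>The flag-parsing change above returns InvalidFlags when 'u' appears twice, the same way the existing flags are handled. A minimal sketch of how that looks from script, assuming an engine with 'u' support; the helper name acceptsFlags is hypothetical and not part of this commit:</p>
<pre>
```javascript
// Hypothetical helper: reports whether the RegExp constructor accepts a flags string.
function acceptsFlags(flags) {
    try {
        new RegExp("a", flags);
        return true;
    } catch (error) {
        // Invalid flags surface to script as a SyntaxError; rethrow anything else.
        if (error instanceof SyntaxError)
            return false;
        throw error;
    }
}

const singleU = acceptsFlags("u");      // 'u' on its own
const mixedFlags = acceptsFlags("giu"); // 'u' combined with older flags
const repeatedU = acceptsFlags("uu");   // 'u' given twice
const unknownFlag = acceptsFlags("?");  // not a flag character at all

console.log(singleU, mixedFlags, repeatedU, unknownFlag);
```
</pre>
<p>The last two calls exercise the two InvalidFlags paths in the switch: a repeated flag and the default case.</p>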
<a id="trunkSourceJavaScriptCoreruntimeRegExph"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/runtime/RegExp.h (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/runtime/RegExp.h        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/runtime/RegExp.h        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -55,6 +55,7 @@
</span><span class="cx">     bool global() const { return m_flags &amp; FlagGlobal; }
</span><span class="cx">     bool ignoreCase() const { return m_flags &amp; FlagIgnoreCase; }
</span><span class="cx">     bool multiline() const { return m_flags &amp; FlagMultiline; }
</span><ins>+    bool unicode() const { return m_flags &amp; FlagUnicode; }
</ins><span class="cx"> 
</span><span class="cx">     const String&amp; pattern() const { return m_patternString; }
</span><span class="cx"> 
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreruntimeRegExpKeyh"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/runtime/RegExpKey.h (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/runtime/RegExpKey.h        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/runtime/RegExpKey.h        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -38,7 +38,8 @@
</span><span class="cx">     FlagGlobal = 1,
</span><span class="cx">     FlagIgnoreCase = 2,
</span><span class="cx">     FlagMultiline = 4,
</span><del>-    InvalidFlags = 8,
</del><ins>+    FlagUnicode = 8,
+    InvalidFlags = 16,
</ins><span class="cx">     DeletedValueFlags = -1
</span><span class="cx"> };
</span><span class="cx"> 
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreruntimeRegExpPrototypecpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/runtime/RegExpPrototype.cpp (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/runtime/RegExpPrototype.cpp        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/runtime/RegExpPrototype.cpp        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -48,6 +48,7 @@
</span><span class="cx"> static EncodedJSValue JSC_HOST_CALL regExpProtoGetterGlobal(ExecState*);
</span><span class="cx"> static EncodedJSValue JSC_HOST_CALL regExpProtoGetterIgnoreCase(ExecState*);
</span><span class="cx"> static EncodedJSValue JSC_HOST_CALL regExpProtoGetterMultiline(ExecState*);
</span><ins>+static EncodedJSValue JSC_HOST_CALL regExpProtoGetterUnicode(ExecState*);
</ins><span class="cx"> static EncodedJSValue JSC_HOST_CALL regExpProtoGetterSource(ExecState*);
</span><span class="cx"> static EncodedJSValue JSC_HOST_CALL regExpProtoGetterFlags(ExecState*);
</span><span class="cx"> 
</span><span class="lines">@@ -68,6 +69,7 @@
</span><span class="cx">   global        regExpProtoGetterGlobal     DontEnum|Accessor
</span><span class="cx">   ignoreCase    regExpProtoGetterIgnoreCase DontEnum|Accessor
</span><span class="cx">   multiline     regExpProtoGetterMultiline  DontEnum|Accessor
</span><ins>+  unicode       regExpProtoGetterUnicode    DontEnum|Accessor
</ins><span class="cx">   source        regExpProtoGetterSource     DontEnum|Accessor
</span><span class="cx">   flags         regExpProtoGetterFlags      DontEnum|Accessor
</span><span class="cx"> @end
</span><span class="lines">@@ -146,7 +148,7 @@
</span><span class="cx">     return JSValue::encode(jsUndefined());
</span><span class="cx"> }
</span><span class="cx"> 
</span><del>-typedef std::array&lt;char, 3 + 1&gt; FlagsString; // 3 different flags and a null character terminator.
</del><ins>+typedef std::array&lt;char, 4 + 1&gt; FlagsString; // 4 different flags and a null character terminator.
</ins><span class="cx"> 
</span><span class="cx"> static inline FlagsString flagsString(ExecState* exec, JSObject* regexp)
</span><span class="cx"> {
</span><span class="lines">@@ -159,6 +161,9 @@
</span><span class="cx">     if (exec-&gt;hadException())
</span><span class="cx">         return string;
</span><span class="cx">     JSValue multilineValue = regexp-&gt;get(exec, exec-&gt;propertyNames().multiline);
</span><ins>+    if (exec-&gt;hadException())
+        return string;
+    JSValue unicodeValue = regexp-&gt;get(exec, exec-&gt;propertyNames().unicode);
</ins><span class="cx"> 
</span><span class="cx">     unsigned index = 0;
</span><span class="cx">     if (globalValue.toBoolean(exec))
</span><span class="lines">@@ -167,6 +172,8 @@
</span><span class="cx">         string[index++] = 'i';
</span><span class="cx">     if (multilineValue.toBoolean(exec))
</span><span class="cx">         string[index++] = 'm';
</span><ins>+    if (unicodeValue.toBoolean(exec))
+        string[index++] = 'u';
</ins><span class="cx">     ASSERT(index &lt; string.size());
</span><span class="cx">     string[index] = 0;
</span><span class="cx">     return string;
</span><span class="lines">@@ -225,6 +232,15 @@
</span><span class="cx">     return JSValue::encode(jsBoolean(asRegExpObject(thisValue)-&gt;regExp()-&gt;multiline()));
</span><span class="cx"> }
</span><span class="cx"> 
</span><ins>+EncodedJSValue JSC_HOST_CALL regExpProtoGetterUnicode(ExecState* exec)
+{
+    JSValue thisValue = exec-&gt;thisValue();
+    if (!thisValue.inherits(RegExpObject::info()))
+        return throwVMTypeError(exec);
+    
+    return JSValue::encode(jsBoolean(asRegExpObject(thisValue)-&gt;regExp()-&gt;unicode()));
+}
+
</ins><span class="cx"> EncodedJSValue JSC_HOST_CALL regExpProtoGetterFlags(ExecState* exec)
</span><span class="cx"> {
</span><span class="cx">     JSValue thisValue = exec-&gt;thisValue();
</span></span></pre></div>
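<p>For orientation, the script-visible effect of the new unicode getter and the widened flags string can be sketched as follows. This is an illustrative snippet, not part of the commit; the helper name describeFlags is hypothetical, and an engine that accepts the 'u' flag is assumed:</p>
<pre>
```javascript
// Hypothetical helper: collects what script code can observe about a RegExp's flags.
function describeFlags(re) {
    return { unicode: re.unicode, flags: re.flags };
}

const withU = describeFlags(/Cocoa/gimu);
const withoutU = describeFlags(/Cocoa/gim);

console.log(withU.unicode, withU.flags);
console.log(withoutU.unicode, withoutU.flags);

// The accessor is defined on RegExp.prototype, so instances do not own it.
const unicodeIsOwnProperty = Object.prototype.hasOwnProperty.call(/Cocoa/u, "unicode");
console.log(unicodeIsOwnProperty);
```
</pre>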
<a id="trunkSourceJavaScriptCoretestses6yaml"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/tests/es6.yaml (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/tests/es6.yaml        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/tests/es6.yaml        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -1091,9 +1091,9 @@
</span><span class="cx"> - path: es6/RegExp_is_subclassable_correct_prototype_chain.js
</span><span class="cx">   cmd: runES6 :normal
</span><span class="cx"> - path: es6/RegExp_y_and_u_flags_u_flag.js
</span><del>-  cmd: runES6 :fail
</del><ins>+  cmd: runES6 :normal
</ins><span class="cx"> - path: es6/RegExp_y_and_u_flags_u_flag_Unicode_code_point_escapes.js
</span><del>-  cmd: runES6 :fail
</del><ins>+  cmd: runES6 :normal
</ins><span class="cx"> - path: es6/RegExp_y_and_u_flags_y_flag.js
</span><span class="cx">   cmd: runES6 :fail
</span><span class="cx"> - path: es6/RegExp_y_and_u_flags_y_flag_lastIndex.js
</span></span></pre></div>
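<p>The two tests switched to :normal above are named for the 'u' flag and for Unicode code point escapes. The snippet below is a rough sketch of that kind of behavior, not a copy of those test files; the sample character is an arbitrary choice:</p>
<pre>
```javascript
// One code point outside the Basic Multilingual Plane, stored as two UTF-16 code units.
const astral = "\u{1F600}";

const codeUnits = astral.length;
// Without 'u', "." consumes a single code unit, so the pair cannot match ^.$
const dotWithoutU = /^.$/.test(astral);
// With 'u', "." consumes the whole code point.
const dotWithU = /^.$/u.test(astral);
// With 'u', the pattern itself may use a \u{...} code point escape.
const escapeWithU = /^\u{1F600}$/u.test(astral);

console.log(codeUnits, dotWithoutU, dotWithU, escapeWithU);
```
</pre>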
<a id="trunkSourceJavaScriptCoretestsstressstaticgetterinnamesjs"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/tests/stress/static-getter-in-names.js (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/tests/stress/static-getter-in-names.js        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/tests/stress/static-getter-in-names.js        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -3,5 +3,5 @@
</span><span class="cx">         throw new Error('bad value: ' + actual);
</span><span class="cx"> }
</span><span class="cx"> 
</span><del>-shouldBe(JSON.stringify(Object.getOwnPropertyNames(RegExp.prototype).sort()), '[&quot;compile&quot;,&quot;constructor&quot;,&quot;exec&quot;,&quot;flags&quot;,&quot;global&quot;,&quot;ignoreCase&quot;,&quot;lastIndex&quot;,&quot;multiline&quot;,&quot;source&quot;,&quot;test&quot;,&quot;toString&quot;]');
</del><ins>+shouldBe(JSON.stringify(Object.getOwnPropertyNames(RegExp.prototype).sort()), '[&quot;compile&quot;,&quot;constructor&quot;,&quot;exec&quot;,&quot;flags&quot;,&quot;global&quot;,&quot;ignoreCase&quot;,&quot;lastIndex&quot;,&quot;multiline&quot;,&quot;source&quot;,&quot;test&quot;,&quot;toString&quot;,&quot;unicode&quot;]');
</ins><span class="cx"> shouldBe(JSON.stringify(Object.getOwnPropertyNames(/Cocoa/).sort()), '[&quot;lastIndex&quot;]');
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreyarrRegularExpressioncpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/yarr/RegularExpression.cpp (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/RegularExpression.cpp        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/yarr/RegularExpression.cpp        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -57,7 +57,7 @@
</span><span class="cx"> 
</span><span class="cx">     std::unique_ptr&lt;JSC::Yarr::BytecodePattern&gt; compile(const String&amp; patternString, TextCaseSensitivity caseSensitivity, MultilineMode multilineMode)
</span><span class="cx">     {
</span><del>-        JSC::Yarr::YarrPattern pattern(patternString, (caseSensitivity == TextCaseInsensitive), (multilineMode == MultilineEnabled), &amp;m_constructionError);
</del><ins>+        JSC::Yarr::YarrPattern pattern(patternString, (caseSensitivity == TextCaseInsensitive), (multilineMode == MultilineEnabled), false, &amp;m_constructionError);
</ins><span class="cx">         if (m_constructionError) {
</span><span class="cx">             LOG_ERROR(&quot;RegularExpression: YARR compile failed with '%s'&quot;, m_constructionError);
</span><span class="cx">             return nullptr;
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreyarrYarrh"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/yarr/Yarr.h (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/Yarr.h        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/yarr/Yarr.h        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -33,8 +33,8 @@
</span><span class="cx"> 
</span><span class="cx"> namespace JSC { namespace Yarr {
</span><span class="cx"> 
</span><del>-#define YarrStackSpaceForBackTrackInfoPatternCharacter 1 // Only for !fixed quantifiers.
-#define YarrStackSpaceForBackTrackInfoCharacterClass 1 // Only for !fixed quantifiers.
</del><ins>+#define YarrStackSpaceForBackTrackInfoPatternCharacter 2 // Only for !fixed quantifiers.
+#define YarrStackSpaceForBackTrackInfoCharacterClass 2 // Only for !fixed quantifiers.
</ins><span class="cx"> #define YarrStackSpaceForBackTrackInfoBackReference 2
</span><span class="cx"> #define YarrStackSpaceForBackTrackInfoAlternative 1 // One per alternative.
</span><span class="cx"> #define YarrStackSpaceForBackTrackInfoParentheticalAssertion 1
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreyarrYarrCanonicalizeUCS2cpp"></a>
<div class="delfile"><h4>Deleted: trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.cpp (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.cpp        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.cpp        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -1,463 +0,0 @@
</span><del>-/*
- * Copyright (C) 2012 Apple Inc. All rights reserved.
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- * 1. Redistributions of source code must retain the above copyright
- *    notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- *    notice, this list of conditions and the following disclaimer in the
- *    documentation and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
- * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
- * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
- * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
- * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
- * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
- * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
- * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
- * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
- */
-
-// DO NOT EDIT! - this file autogenerated by YarrCanonicalizeUCS2.js
-
-#include &quot;config.h&quot;
-#include &quot;YarrCanonicalizeUCS2.h&quot;
-
-namespace JSC { namespace Yarr {
-
-#include &lt;stdint.h&gt;
-
-const uint16_t ucs2CharacterSet0[] = { 0x01c4u, 0x01c5u, 0x01c6u, 0 };
-const uint16_t ucs2CharacterSet1[] = { 0x01c7u, 0x01c8u, 0x01c9u, 0 };
-const uint16_t ucs2CharacterSet2[] = { 0x01cau, 0x01cbu, 0x01ccu, 0 };
-const uint16_t ucs2CharacterSet3[] = { 0x01f1u, 0x01f2u, 0x01f3u, 0 };
-const uint16_t ucs2CharacterSet4[] = { 0x0392u, 0x03b2u, 0x03d0u, 0 };
-const uint16_t ucs2CharacterSet5[] = { 0x0395u, 0x03b5u, 0x03f5u, 0 };
-const uint16_t ucs2CharacterSet6[] = { 0x0398u, 0x03b8u, 0x03d1u, 0 };
-const uint16_t ucs2CharacterSet7[] = { 0x0345u, 0x0399u, 0x03b9u, 0x1fbeu, 0 };
-const uint16_t ucs2CharacterSet8[] = { 0x039au, 0x03bau, 0x03f0u, 0 };
-const uint16_t ucs2CharacterSet9[] = { 0x00b5u, 0x039cu, 0x03bcu, 0 };
-const uint16_t ucs2CharacterSet10[] = { 0x03a0u, 0x03c0u, 0x03d6u, 0 };
-const uint16_t ucs2CharacterSet11[] = { 0x03a1u, 0x03c1u, 0x03f1u, 0 };
-const uint16_t ucs2CharacterSet12[] = { 0x03a3u, 0x03c2u, 0x03c3u, 0 };
-const uint16_t ucs2CharacterSet13[] = { 0x03a6u, 0x03c6u, 0x03d5u, 0 };
-const uint16_t ucs2CharacterSet14[] = { 0x1e60u, 0x1e61u, 0x1e9bu, 0 };
-
-static const size_t UCS2_CANONICALIZATION_SETS = 15;
-const uint16_t* const characterSetInfo[UCS2_CANONICALIZATION_SETS] = {
-    ucs2CharacterSet0,
-    ucs2CharacterSet1,
-    ucs2CharacterSet2,
-    ucs2CharacterSet3,
-    ucs2CharacterSet4,
-    ucs2CharacterSet5,
-    ucs2CharacterSet6,
-    ucs2CharacterSet7,
-    ucs2CharacterSet8,
-    ucs2CharacterSet9,
-    ucs2CharacterSet10,
-    ucs2CharacterSet11,
-    ucs2CharacterSet12,
-    ucs2CharacterSet13,
-    ucs2CharacterSet14,
-};
-
-const size_t UCS2_CANONICALIZATION_RANGES = 364;
-const UCS2CanonicalizationRange rangeInfo[UCS2_CANONICALIZATION_RANGES] = {
-    { 0x0000u, 0x0040u, 0x0000u, CanonicalizeUnique },
-    { 0x0041u, 0x005au, 0x0020u, CanonicalizeRangeLo },
-    { 0x005bu, 0x0060u, 0x0000u, CanonicalizeUnique },
-    { 0x0061u, 0x007au, 0x0020u, CanonicalizeRangeHi },
-    { 0x007bu, 0x00b4u, 0x0000u, CanonicalizeUnique },
-    { 0x00b5u, 0x00b5u, 0x0009u, CanonicalizeSet },
-    { 0x00b6u, 0x00bfu, 0x0000u, CanonicalizeUnique },
-    { 0x00c0u, 0x00d6u, 0x0020u, CanonicalizeRangeLo },
-    { 0x00d7u, 0x00d7u, 0x0000u, CanonicalizeUnique },
-    { 0x00d8u, 0x00deu, 0x0020u, CanonicalizeRangeLo },
-    { 0x00dfu, 0x00dfu, 0x0000u, CanonicalizeUnique },
-    { 0x00e0u, 0x00f6u, 0x0020u, CanonicalizeRangeHi },
-    { 0x00f7u, 0x00f7u, 0x0000u, CanonicalizeUnique },
-    { 0x00f8u, 0x00feu, 0x0020u, CanonicalizeRangeHi },
-    { 0x00ffu, 0x00ffu, 0x0079u, CanonicalizeRangeLo },
-    { 0x0100u, 0x012fu, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x0130u, 0x0131u, 0x0000u, CanonicalizeUnique },
-    { 0x0132u, 0x0137u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x0138u, 0x0138u, 0x0000u, CanonicalizeUnique },
-    { 0x0139u, 0x0148u, 0x0000u, CanonicalizeAlternatingUnaligned },
-    { 0x0149u, 0x0149u, 0x0000u, CanonicalizeUnique },
-    { 0x014au, 0x0177u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x0178u, 0x0178u, 0x0079u, CanonicalizeRangeHi },
-    { 0x0179u, 0x017eu, 0x0000u, CanonicalizeAlternatingUnaligned },
-    { 0x017fu, 0x017fu, 0x0000u, CanonicalizeUnique },
-    { 0x0180u, 0x0180u, 0x00c3u, CanonicalizeRangeLo },
-    { 0x0181u, 0x0181u, 0x00d2u, CanonicalizeRangeLo },
-    { 0x0182u, 0x0185u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x0186u, 0x0186u, 0x00ceu, CanonicalizeRangeLo },
-    { 0x0187u, 0x0188u, 0x0000u, CanonicalizeAlternatingUnaligned },
-    { 0x0189u, 0x018au, 0x00cdu, CanonicalizeRangeLo },
-    { 0x018bu, 0x018cu, 0x0000u, CanonicalizeAlternatingUnaligned },
-    { 0x018du, 0x018du, 0x0000u, CanonicalizeUnique },
-    { 0x018eu, 0x018eu, 0x004fu, CanonicalizeRangeLo },
-    { 0x018fu, 0x018fu, 0x00cau, CanonicalizeRangeLo },
-    { 0x0190u, 0x0190u, 0x00cbu, CanonicalizeRangeLo },
-    { 0x0191u, 0x0192u, 0x0000u, CanonicalizeAlternatingUnaligned },
-    { 0x0193u, 0x0193u, 0x00cdu, CanonicalizeRangeLo },
-    { 0x0194u, 0x0194u, 0x00cfu, CanonicalizeRangeLo },
-    { 0x0195u, 0x0195u, 0x0061u, CanonicalizeRangeLo },
-    { 0x0196u, 0x0196u, 0x00d3u, CanonicalizeRangeLo },
-    { 0x0197u, 0x0197u, 0x00d1u, CanonicalizeRangeLo },
-    { 0x0198u, 0x0199u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x019au, 0x019au, 0x00a3u, CanonicalizeRangeLo },
-    { 0x019bu, 0x019bu, 0x0000u, CanonicalizeUnique },
-    { 0x019cu, 0x019cu, 0x00d3u, CanonicalizeRangeLo },
-    { 0x019du, 0x019du, 0x00d5u, CanonicalizeRangeLo },
-    { 0x019eu, 0x019eu, 0x0082u, CanonicalizeRangeLo },
-    { 0x019fu, 0x019fu, 0x00d6u, CanonicalizeRangeLo },
-    { 0x01a0u, 0x01a5u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x01a6u, 0x01a6u, 0x00dau, CanonicalizeRangeLo },
-    { 0x01a7u, 0x01a8u, 0x0000u, CanonicalizeAlternatingUnaligned },
-    { 0x01a9u, 0x01a9u, 0x00dau, CanonicalizeRangeLo },
-    { 0x01aau, 0x01abu, 0x0000u, CanonicalizeUnique },
-    { 0x01acu, 0x01adu, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x01aeu, 0x01aeu, 0x00dau, CanonicalizeRangeLo },
-    { 0x01afu, 0x01b0u, 0x0000u, CanonicalizeAlternatingUnaligned },
-    { 0x01b1u, 0x01b2u, 0x00d9u, CanonicalizeRangeLo },
-    { 0x01b3u, 0x01b6u, 0x0000u, CanonicalizeAlternatingUnaligned },
-    { 0x01b7u, 0x01b7u, 0x00dbu, CanonicalizeRangeLo },
-    { 0x01b8u, 0x01b9u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x01bau, 0x01bbu, 0x0000u, CanonicalizeUnique },
-    { 0x01bcu, 0x01bdu, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x01beu, 0x01beu, 0x0000u, CanonicalizeUnique },
-    { 0x01bfu, 0x01bfu, 0x0038u, CanonicalizeRangeLo },
-    { 0x01c0u, 0x01c3u, 0x0000u, CanonicalizeUnique },
-    { 0x01c4u, 0x01c6u, 0x0000u, CanonicalizeSet },
-    { 0x01c7u, 0x01c9u, 0x0001u, CanonicalizeSet },
-    { 0x01cau, 0x01ccu, 0x0002u, CanonicalizeSet },
-    { 0x01cdu, 0x01dcu, 0x0000u, CanonicalizeAlternatingUnaligned },
-    { 0x01ddu, 0x01ddu, 0x004fu, CanonicalizeRangeHi },
-    { 0x01deu, 0x01efu, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x01f0u, 0x01f0u, 0x0000u, CanonicalizeUnique },
-    { 0x01f1u, 0x01f3u, 0x0003u, CanonicalizeSet },
-    { 0x01f4u, 0x01f5u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x01f6u, 0x01f6u, 0x0061u, CanonicalizeRangeHi },
-    { 0x01f7u, 0x01f7u, 0x0038u, CanonicalizeRangeHi },
-    { 0x01f8u, 0x021fu, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x0220u, 0x0220u, 0x0082u, CanonicalizeRangeHi },
-    { 0x0221u, 0x0221u, 0x0000u, CanonicalizeUnique },
-    { 0x0222u, 0x0233u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x0234u, 0x0239u, 0x0000u, CanonicalizeUnique },
-    { 0x023au, 0x023au, 0x2a2bu, CanonicalizeRangeLo },
-    { 0x023bu, 0x023cu, 0x0000u, CanonicalizeAlternatingUnaligned },
-    { 0x023du, 0x023du, 0x00a3u, CanonicalizeRangeHi },
-    { 0x023eu, 0x023eu, 0x2a28u, CanonicalizeRangeLo },
-    { 0x023fu, 0x0240u, 0x2a3fu, CanonicalizeRangeLo },
-    { 0x0241u, 0x0242u, 0x0000u, CanonicalizeAlternatingUnaligned },
-    { 0x0243u, 0x0243u, 0x00c3u, CanonicalizeRangeHi },
-    { 0x0244u, 0x0244u, 0x0045u, CanonicalizeRangeLo },
-    { 0x0245u, 0x0245u, 0x0047u, CanonicalizeRangeLo },
-    { 0x0246u, 0x024fu, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x0250u, 0x0250u, 0x2a1fu, CanonicalizeRangeLo },
-    { 0x0251u, 0x0251u, 0x2a1cu, CanonicalizeRangeLo },
-    { 0x0252u, 0x0252u, 0x2a1eu, CanonicalizeRangeLo },
-    { 0x0253u, 0x0253u, 0x00d2u, CanonicalizeRangeHi },
-    { 0x0254u, 0x0254u, 0x00ceu, CanonicalizeRangeHi },
-    { 0x0255u, 0x0255u, 0x0000u, CanonicalizeUnique },
-    { 0x0256u, 0x0257u, 0x00cdu, CanonicalizeRangeHi },
-    { 0x0258u, 0x0258u, 0x0000u, CanonicalizeUnique },
-    { 0x0259u, 0x0259u, 0x00cau, CanonicalizeRangeHi },
-    { 0x025au, 0x025au, 0x0000u, CanonicalizeUnique },
-    { 0x025bu, 0x025bu, 0x00cbu, CanonicalizeRangeHi },
-    { 0x025cu, 0x025fu, 0x0000u, CanonicalizeUnique },
-    { 0x0260u, 0x0260u, 0x00cdu, CanonicalizeRangeHi },
-    { 0x0261u, 0x0262u, 0x0000u, CanonicalizeUnique },
-    { 0x0263u, 0x0263u, 0x00cfu, CanonicalizeRangeHi },
-    { 0x0264u, 0x0264u, 0x0000u, CanonicalizeUnique },
-    { 0x0265u, 0x0265u, 0xa528u, CanonicalizeRangeLo },
-    { 0x0266u, 0x0267u, 0x0000u, CanonicalizeUnique },
-    { 0x0268u, 0x0268u, 0x00d1u, CanonicalizeRangeHi },
-    { 0x0269u, 0x0269u, 0x00d3u, CanonicalizeRangeHi },
-    { 0x026au, 0x026au, 0x0000u, CanonicalizeUnique },
-    { 0x026bu, 0x026bu, 0x29f7u, CanonicalizeRangeLo },
-    { 0x026cu, 0x026eu, 0x0000u, CanonicalizeUnique },
-    { 0x026fu, 0x026fu, 0x00d3u, CanonicalizeRangeHi },
-    { 0x0270u, 0x0270u, 0x0000u, CanonicalizeUnique },
-    { 0x0271u, 0x0271u, 0x29fdu, CanonicalizeRangeLo },
-    { 0x0272u, 0x0272u, 0x00d5u, CanonicalizeRangeHi },
-    { 0x0273u, 0x0274u, 0x0000u, CanonicalizeUnique },
-    { 0x0275u, 0x0275u, 0x00d6u, CanonicalizeRangeHi },
-    { 0x0276u, 0x027cu, 0x0000u, CanonicalizeUnique },
-    { 0x027du, 0x027du, 0x29e7u, CanonicalizeRangeLo },
-    { 0x027eu, 0x027fu, 0x0000u, CanonicalizeUnique },
-    { 0x0280u, 0x0280u, 0x00dau, CanonicalizeRangeHi },
-    { 0x0281u, 0x0282u, 0x0000u, CanonicalizeUnique },
-    { 0x0283u, 0x0283u, 0x00dau, CanonicalizeRangeHi },
-    { 0x0284u, 0x0287u, 0x0000u, CanonicalizeUnique },
-    { 0x0288u, 0x0288u, 0x00dau, CanonicalizeRangeHi },
-    { 0x0289u, 0x0289u, 0x0045u, CanonicalizeRangeHi },
-    { 0x028au, 0x028bu, 0x00d9u, CanonicalizeRangeHi },
-    { 0x028cu, 0x028cu, 0x0047u, CanonicalizeRangeHi },
-    { 0x028du, 0x0291u, 0x0000u, CanonicalizeUnique },
-    { 0x0292u, 0x0292u, 0x00dbu, CanonicalizeRangeHi },
-    { 0x0293u, 0x0344u, 0x0000u, CanonicalizeUnique },
-    { 0x0345u, 0x0345u, 0x0007u, CanonicalizeSet },
-    { 0x0346u, 0x036fu, 0x0000u, CanonicalizeUnique },
-    { 0x0370u, 0x0373u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x0374u, 0x0375u, 0x0000u, CanonicalizeUnique },
-    { 0x0376u, 0x0377u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x0378u, 0x037au, 0x0000u, CanonicalizeUnique },
-    { 0x037bu, 0x037du, 0x0082u, CanonicalizeRangeLo },
-    { 0x037eu, 0x0385u, 0x0000u, CanonicalizeUnique },
-    { 0x0386u, 0x0386u, 0x0026u, CanonicalizeRangeLo },
-    { 0x0387u, 0x0387u, 0x0000u, CanonicalizeUnique },
-    { 0x0388u, 0x038au, 0x0025u, CanonicalizeRangeLo },
-    { 0x038bu, 0x038bu, 0x0000u, CanonicalizeUnique },
-    { 0x038cu, 0x038cu, 0x0040u, CanonicalizeRangeLo },
-    { 0x038du, 0x038du, 0x0000u, CanonicalizeUnique },
-    { 0x038eu, 0x038fu, 0x003fu, CanonicalizeRangeLo },
-    { 0x0390u, 0x0390u, 0x0000u, CanonicalizeUnique },
-    { 0x0391u, 0x0391u, 0x0020u, CanonicalizeRangeLo },
-    { 0x0392u, 0x0392u, 0x0004u, CanonicalizeSet },
-    { 0x0393u, 0x0394u, 0x0020u, CanonicalizeRangeLo },
-    { 0x0395u, 0x0395u, 0x0005u, CanonicalizeSet },
-    { 0x0396u, 0x0397u, 0x0020u, CanonicalizeRangeLo },
-    { 0x0398u, 0x0398u, 0x0006u, CanonicalizeSet },
-    { 0x0399u, 0x0399u, 0x0007u, CanonicalizeSet },
-    { 0x039au, 0x039au, 0x0008u, CanonicalizeSet },
-    { 0x039bu, 0x039bu, 0x0020u, CanonicalizeRangeLo },
-    { 0x039cu, 0x039cu, 0x0009u, CanonicalizeSet },
-    { 0x039du, 0x039fu, 0x0020u, CanonicalizeRangeLo },
-    { 0x03a0u, 0x03a0u, 0x000au, CanonicalizeSet },
-    { 0x03a1u, 0x03a1u, 0x000bu, CanonicalizeSet },
-    { 0x03a2u, 0x03a2u, 0x0000u, CanonicalizeUnique },
-    { 0x03a3u, 0x03a3u, 0x000cu, CanonicalizeSet },
-    { 0x03a4u, 0x03a5u, 0x0020u, CanonicalizeRangeLo },
-    { 0x03a6u, 0x03a6u, 0x000du, CanonicalizeSet },
-    { 0x03a7u, 0x03abu, 0x0020u, CanonicalizeRangeLo },
-    { 0x03acu, 0x03acu, 0x0026u, CanonicalizeRangeHi },
-    { 0x03adu, 0x03afu, 0x0025u, CanonicalizeRangeHi },
-    { 0x03b0u, 0x03b0u, 0x0000u, CanonicalizeUnique },
-    { 0x03b1u, 0x03b1u, 0x0020u, CanonicalizeRangeHi },
-    { 0x03b2u, 0x03b2u, 0x0004u, CanonicalizeSet },
-    { 0x03b3u, 0x03b4u, 0x0020u, CanonicalizeRangeHi },
-    { 0x03b5u, 0x03b5u, 0x0005u, CanonicalizeSet },
-    { 0x03b6u, 0x03b7u, 0x0020u, CanonicalizeRangeHi },
-    { 0x03b8u, 0x03b8u, 0x0006u, CanonicalizeSet },
-    { 0x03b9u, 0x03b9u, 0x0007u, CanonicalizeSet },
-    { 0x03bau, 0x03bau, 0x0008u, CanonicalizeSet },
-    { 0x03bbu, 0x03bbu, 0x0020u, CanonicalizeRangeHi },
-    { 0x03bcu, 0x03bcu, 0x0009u, CanonicalizeSet },
-    { 0x03bdu, 0x03bfu, 0x0020u, CanonicalizeRangeHi },
-    { 0x03c0u, 0x03c0u, 0x000au, CanonicalizeSet },
-    { 0x03c1u, 0x03c1u, 0x000bu, CanonicalizeSet },
-    { 0x03c2u, 0x03c3u, 0x000cu, CanonicalizeSet },
-    { 0x03c4u, 0x03c5u, 0x0020u, CanonicalizeRangeHi },
-    { 0x03c6u, 0x03c6u, 0x000du, CanonicalizeSet },
-    { 0x03c7u, 0x03cbu, 0x0020u, CanonicalizeRangeHi },
-    { 0x03ccu, 0x03ccu, 0x0040u, CanonicalizeRangeHi },
-    { 0x03cdu, 0x03ceu, 0x003fu, CanonicalizeRangeHi },
-    { 0x03cfu, 0x03cfu, 0x0008u, CanonicalizeRangeLo },
-    { 0x03d0u, 0x03d0u, 0x0004u, CanonicalizeSet },
-    { 0x03d1u, 0x03d1u, 0x0006u, CanonicalizeSet },
-    { 0x03d2u, 0x03d4u, 0x0000u, CanonicalizeUnique },
-    { 0x03d5u, 0x03d5u, 0x000du, CanonicalizeSet },
-    { 0x03d6u, 0x03d6u, 0x000au, CanonicalizeSet },
-    { 0x03d7u, 0x03d7u, 0x0008u, CanonicalizeRangeHi },
-    { 0x03d8u, 0x03efu, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x03f0u, 0x03f0u, 0x0008u, CanonicalizeSet },
-    { 0x03f1u, 0x03f1u, 0x000bu, CanonicalizeSet },
-    { 0x03f2u, 0x03f2u, 0x0007u, CanonicalizeRangeLo },
-    { 0x03f3u, 0x03f4u, 0x0000u, CanonicalizeUnique },
-    { 0x03f5u, 0x03f5u, 0x0005u, CanonicalizeSet },
-    { 0x03f6u, 0x03f6u, 0x0000u, CanonicalizeUnique },
-    { 0x03f7u, 0x03f8u, 0x0000u, CanonicalizeAlternatingUnaligned },
-    { 0x03f9u, 0x03f9u, 0x0007u, CanonicalizeRangeHi },
-    { 0x03fau, 0x03fbu, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x03fcu, 0x03fcu, 0x0000u, CanonicalizeUnique },
-    { 0x03fdu, 0x03ffu, 0x0082u, CanonicalizeRangeHi },
-    { 0x0400u, 0x040fu, 0x0050u, CanonicalizeRangeLo },
-    { 0x0410u, 0x042fu, 0x0020u, CanonicalizeRangeLo },
-    { 0x0430u, 0x044fu, 0x0020u, CanonicalizeRangeHi },
-    { 0x0450u, 0x045fu, 0x0050u, CanonicalizeRangeHi },
-    { 0x0460u, 0x0481u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x0482u, 0x0489u, 0x0000u, CanonicalizeUnique },
-    { 0x048au, 0x04bfu, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x04c0u, 0x04c0u, 0x000fu, CanonicalizeRangeLo },
-    { 0x04c1u, 0x04ceu, 0x0000u, CanonicalizeAlternatingUnaligned },
-    { 0x04cfu, 0x04cfu, 0x000fu, CanonicalizeRangeHi },
-    { 0x04d0u, 0x0527u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x0528u, 0x0530u, 0x0000u, CanonicalizeUnique },
-    { 0x0531u, 0x0556u, 0x0030u, CanonicalizeRangeLo },
-    { 0x0557u, 0x0560u, 0x0000u, CanonicalizeUnique },
-    { 0x0561u, 0x0586u, 0x0030u, CanonicalizeRangeHi },
-    { 0x0587u, 0x109fu, 0x0000u, CanonicalizeUnique },
-    { 0x10a0u, 0x10c5u, 0x1c60u, CanonicalizeRangeLo },
-    { 0x10c6u, 0x1d78u, 0x0000u, CanonicalizeUnique },
-    { 0x1d79u, 0x1d79u, 0x8a04u, CanonicalizeRangeLo },
-    { 0x1d7au, 0x1d7cu, 0x0000u, CanonicalizeUnique },
-    { 0x1d7du, 0x1d7du, 0x0ee6u, CanonicalizeRangeLo },
-    { 0x1d7eu, 0x1dffu, 0x0000u, CanonicalizeUnique },
-    { 0x1e00u, 0x1e5fu, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x1e60u, 0x1e61u, 0x000eu, CanonicalizeSet },
-    { 0x1e62u, 0x1e95u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x1e96u, 0x1e9au, 0x0000u, CanonicalizeUnique },
-    { 0x1e9bu, 0x1e9bu, 0x000eu, CanonicalizeSet },
-    { 0x1e9cu, 0x1e9fu, 0x0000u, CanonicalizeUnique },
-    { 0x1ea0u, 0x1effu, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x1f00u, 0x1f07u, 0x0008u, CanonicalizeRangeLo },
-    { 0x1f08u, 0x1f0fu, 0x0008u, CanonicalizeRangeHi },
-    { 0x1f10u, 0x1f15u, 0x0008u, CanonicalizeRangeLo },
-    { 0x1f16u, 0x1f17u, 0x0000u, CanonicalizeUnique },
-    { 0x1f18u, 0x1f1du, 0x0008u, CanonicalizeRangeHi },
-    { 0x1f1eu, 0x1f1fu, 0x0000u, CanonicalizeUnique },
-    { 0x1f20u, 0x1f27u, 0x0008u, CanonicalizeRangeLo },
-    { 0x1f28u, 0x1f2fu, 0x0008u, CanonicalizeRangeHi },
-    { 0x1f30u, 0x1f37u, 0x0008u, CanonicalizeRangeLo },
-    { 0x1f38u, 0x1f3fu, 0x0008u, CanonicalizeRangeHi },
-    { 0x1f40u, 0x1f45u, 0x0008u, CanonicalizeRangeLo },
-    { 0x1f46u, 0x1f47u, 0x0000u, CanonicalizeUnique },
-    { 0x1f48u, 0x1f4du, 0x0008u, CanonicalizeRangeHi },
-    { 0x1f4eu, 0x1f50u, 0x0000u, CanonicalizeUnique },
-    { 0x1f51u, 0x1f51u, 0x0008u, CanonicalizeRangeLo },
-    { 0x1f52u, 0x1f52u, 0x0000u, CanonicalizeUnique },
-    { 0x1f53u, 0x1f53u, 0x0008u, CanonicalizeRangeLo },
-    { 0x1f54u, 0x1f54u, 0x0000u, CanonicalizeUnique },
-    { 0x1f55u, 0x1f55u, 0x0008u, CanonicalizeRangeLo },
-    { 0x1f56u, 0x1f56u, 0x0000u, CanonicalizeUnique },
-    { 0x1f57u, 0x1f57u, 0x0008u, CanonicalizeRangeLo },
-    { 0x1f58u, 0x1f58u, 0x0000u, CanonicalizeUnique },
-    { 0x1f59u, 0x1f59u, 0x0008u, CanonicalizeRangeHi },
-    { 0x1f5au, 0x1f5au, 0x0000u, CanonicalizeUnique },
-    { 0x1f5bu, 0x1f5bu, 0x0008u, CanonicalizeRangeHi },
-    { 0x1f5cu, 0x1f5cu, 0x0000u, CanonicalizeUnique },
-    { 0x1f5du, 0x1f5du, 0x0008u, CanonicalizeRangeHi },
-    { 0x1f5eu, 0x1f5eu, 0x0000u, CanonicalizeUnique },
-    { 0x1f5fu, 0x1f5fu, 0x0008u, CanonicalizeRangeHi },
-    { 0x1f60u, 0x1f67u, 0x0008u, CanonicalizeRangeLo },
-    { 0x1f68u, 0x1f6fu, 0x0008u, CanonicalizeRangeHi },
-    { 0x1f70u, 0x1f71u, 0x004au, CanonicalizeRangeLo },
-    { 0x1f72u, 0x1f75u, 0x0056u, CanonicalizeRangeLo },
-    { 0x1f76u, 0x1f77u, 0x0064u, CanonicalizeRangeLo },
-    { 0x1f78u, 0x1f79u, 0x0080u, CanonicalizeRangeLo },
-    { 0x1f7au, 0x1f7bu, 0x0070u, CanonicalizeRangeLo },
-    { 0x1f7cu, 0x1f7du, 0x007eu, CanonicalizeRangeLo },
-    { 0x1f7eu, 0x1fafu, 0x0000u, CanonicalizeUnique },
-    { 0x1fb0u, 0x1fb1u, 0x0008u, CanonicalizeRangeLo },
-    { 0x1fb2u, 0x1fb7u, 0x0000u, CanonicalizeUnique },
-    { 0x1fb8u, 0x1fb9u, 0x0008u, CanonicalizeRangeHi },
-    { 0x1fbau, 0x1fbbu, 0x004au, CanonicalizeRangeHi },
-    { 0x1fbcu, 0x1fbdu, 0x0000u, CanonicalizeUnique },
-    { 0x1fbeu, 0x1fbeu, 0x0007u, CanonicalizeSet },
-    { 0x1fbfu, 0x1fc7u, 0x0000u, CanonicalizeUnique },
-    { 0x1fc8u, 0x1fcbu, 0x0056u, CanonicalizeRangeHi },
-    { 0x1fccu, 0x1fcfu, 0x0000u, CanonicalizeUnique },
-    { 0x1fd0u, 0x1fd1u, 0x0008u, CanonicalizeRangeLo },
-    { 0x1fd2u, 0x1fd7u, 0x0000u, CanonicalizeUnique },
-    { 0x1fd8u, 0x1fd9u, 0x0008u, CanonicalizeRangeHi },
-    { 0x1fdau, 0x1fdbu, 0x0064u, CanonicalizeRangeHi },
-    { 0x1fdcu, 0x1fdfu, 0x0000u, CanonicalizeUnique },
-    { 0x1fe0u, 0x1fe1u, 0x0008u, CanonicalizeRangeLo },
-    { 0x1fe2u, 0x1fe4u, 0x0000u, CanonicalizeUnique },
-    { 0x1fe5u, 0x1fe5u, 0x0007u, CanonicalizeRangeLo },
-    { 0x1fe6u, 0x1fe7u, 0x0000u, CanonicalizeUnique },
-    { 0x1fe8u, 0x1fe9u, 0x0008u, CanonicalizeRangeHi },
-    { 0x1feau, 0x1febu, 0x0070u, CanonicalizeRangeHi },
-    { 0x1fecu, 0x1fecu, 0x0007u, CanonicalizeRangeHi },
-    { 0x1fedu, 0x1ff7u, 0x0000u, CanonicalizeUnique },
-    { 0x1ff8u, 0x1ff9u, 0x0080u, CanonicalizeRangeHi },
-    { 0x1ffau, 0x1ffbu, 0x007eu, CanonicalizeRangeHi },
-    { 0x1ffcu, 0x2131u, 0x0000u, CanonicalizeUnique },
-    { 0x2132u, 0x2132u, 0x001cu, CanonicalizeRangeLo },
-    { 0x2133u, 0x214du, 0x0000u, CanonicalizeUnique },
-    { 0x214eu, 0x214eu, 0x001cu, CanonicalizeRangeHi },
-    { 0x214fu, 0x215fu, 0x0000u, CanonicalizeUnique },
-    { 0x2160u, 0x216fu, 0x0010u, CanonicalizeRangeLo },
-    { 0x2170u, 0x217fu, 0x0010u, CanonicalizeRangeHi },
-    { 0x2180u, 0x2182u, 0x0000u, CanonicalizeUnique },
-    { 0x2183u, 0x2184u, 0x0000u, CanonicalizeAlternatingUnaligned },
-    { 0x2185u, 0x24b5u, 0x0000u, CanonicalizeUnique },
-    { 0x24b6u, 0x24cfu, 0x001au, CanonicalizeRangeLo },
-    { 0x24d0u, 0x24e9u, 0x001au, CanonicalizeRangeHi },
-    { 0x24eau, 0x2bffu, 0x0000u, CanonicalizeUnique },
-    { 0x2c00u, 0x2c2eu, 0x0030u, CanonicalizeRangeLo },
-    { 0x2c2fu, 0x2c2fu, 0x0000u, CanonicalizeUnique },
-    { 0x2c30u, 0x2c5eu, 0x0030u, CanonicalizeRangeHi },
-    { 0x2c5fu, 0x2c5fu, 0x0000u, CanonicalizeUnique },
-    { 0x2c60u, 0x2c61u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x2c62u, 0x2c62u, 0x29f7u, CanonicalizeRangeHi },
-    { 0x2c63u, 0x2c63u, 0x0ee6u, CanonicalizeRangeHi },
-    { 0x2c64u, 0x2c64u, 0x29e7u, CanonicalizeRangeHi },
-    { 0x2c65u, 0x2c65u, 0x2a2bu, CanonicalizeRangeHi },
-    { 0x2c66u, 0x2c66u, 0x2a28u, CanonicalizeRangeHi },
-    { 0x2c67u, 0x2c6cu, 0x0000u, CanonicalizeAlternatingUnaligned },
-    { 0x2c6du, 0x2c6du, 0x2a1cu, CanonicalizeRangeHi },
-    { 0x2c6eu, 0x2c6eu, 0x29fdu, CanonicalizeRangeHi },
-    { 0x2c6fu, 0x2c6fu, 0x2a1fu, CanonicalizeRangeHi },
-    { 0x2c70u, 0x2c70u, 0x2a1eu, CanonicalizeRangeHi },
-    { 0x2c71u, 0x2c71u, 0x0000u, CanonicalizeUnique },
-    { 0x2c72u, 0x2c73u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x2c74u, 0x2c74u, 0x0000u, CanonicalizeUnique },
-    { 0x2c75u, 0x2c76u, 0x0000u, CanonicalizeAlternatingUnaligned },
-    { 0x2c77u, 0x2c7du, 0x0000u, CanonicalizeUnique },
-    { 0x2c7eu, 0x2c7fu, 0x2a3fu, CanonicalizeRangeHi },
-    { 0x2c80u, 0x2ce3u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0x2ce4u, 0x2ceau, 0x0000u, CanonicalizeUnique },
-    { 0x2cebu, 0x2ceeu, 0x0000u, CanonicalizeAlternatingUnaligned },
-    { 0x2cefu, 0x2cffu, 0x0000u, CanonicalizeUnique },
-    { 0x2d00u, 0x2d25u, 0x1c60u, CanonicalizeRangeHi },
-    { 0x2d26u, 0xa63fu, 0x0000u, CanonicalizeUnique },
-    { 0xa640u, 0xa66du, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0xa66eu, 0xa67fu, 0x0000u, CanonicalizeUnique },
-    { 0xa680u, 0xa697u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0xa698u, 0xa721u, 0x0000u, CanonicalizeUnique },
-    { 0xa722u, 0xa72fu, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0xa730u, 0xa731u, 0x0000u, CanonicalizeUnique },
-    { 0xa732u, 0xa76fu, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0xa770u, 0xa778u, 0x0000u, CanonicalizeUnique },
-    { 0xa779u, 0xa77cu, 0x0000u, CanonicalizeAlternatingUnaligned },
-    { 0xa77du, 0xa77du, 0x8a04u, CanonicalizeRangeHi },
-    { 0xa77eu, 0xa787u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0xa788u, 0xa78au, 0x0000u, CanonicalizeUnique },
-    { 0xa78bu, 0xa78cu, 0x0000u, CanonicalizeAlternatingUnaligned },
-    { 0xa78du, 0xa78du, 0xa528u, CanonicalizeRangeHi },
-    { 0xa78eu, 0xa78fu, 0x0000u, CanonicalizeUnique },
-    { 0xa790u, 0xa791u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0xa792u, 0xa79fu, 0x0000u, CanonicalizeUnique },
-    { 0xa7a0u, 0xa7a9u, 0x0000u, CanonicalizeAlternatingAligned },
-    { 0xa7aau, 0xff20u, 0x0000u, CanonicalizeUnique },
-    { 0xff21u, 0xff3au, 0x0020u, CanonicalizeRangeLo },
-    { 0xff3bu, 0xff40u, 0x0000u, CanonicalizeUnique },
-    { 0xff41u, 0xff5au, 0x0020u, CanonicalizeRangeHi },
-    { 0xff5bu, 0xffffu, 0x0000u, CanonicalizeUnique },
-};
-
-const size_t LATIN_CANONICALIZATION_RANGES = 20;
-LatinCanonicalizationRange latinRangeInfo[LATIN_CANONICALIZATION_RANGES] = {
-    { 0x0000u, 0x0040u, 0x0000u, CanonicalizeLatinSelf },
-    { 0x0041u, 0x005au, 0x0000u, CanonicalizeLatinMask0x20 },
-    { 0x005bu, 0x0060u, 0x0000u, CanonicalizeLatinSelf },
-    { 0x0061u, 0x007au, 0x0000u, CanonicalizeLatinMask0x20 },
-    { 0x007bu, 0x00bfu, 0x0000u, CanonicalizeLatinSelf },
-    { 0x00c0u, 0x00d6u, 0x0000u, CanonicalizeLatinMask0x20 },
-    { 0x00d7u, 0x00d7u, 0x0000u, CanonicalizeLatinSelf },
-    { 0x00d8u, 0x00deu, 0x0000u, CanonicalizeLatinMask0x20 },
-    { 0x00dfu, 0x00dfu, 0x0000u, CanonicalizeLatinSelf },
-    { 0x00e0u, 0x00f6u, 0x0000u, CanonicalizeLatinMask0x20 },
-    { 0x00f7u, 0x00f7u, 0x0000u, CanonicalizeLatinSelf },
-    { 0x00f8u, 0x00feu, 0x0000u, CanonicalizeLatinMask0x20 },
-    { 0x00ffu, 0x00ffu, 0x0000u, CanonicalizeLatinSelf },
-    { 0x0100u, 0x0177u, 0x0000u, CanonicalizeLatinInvalid },
-    { 0x0178u, 0x0178u, 0x00ffu, CanonicalizeLatinOther },
-    { 0x0179u, 0x039bu, 0x0000u, CanonicalizeLatinInvalid },
-    { 0x039cu, 0x039cu, 0x00b5u, CanonicalizeLatinOther },
-    { 0x039du, 0x03bbu, 0x0000u, CanonicalizeLatinInvalid },
-    { 0x03bcu, 0x03bcu, 0x00b5u, CanonicalizeLatinOther },
-    { 0x03bdu, 0xffffu, 0x0000u, CanonicalizeLatinInvalid },
-};
-
-} } // JSC::Yarr
-
</del></span></pre></div>
<a id="trunkSourceJavaScriptCoreyarrYarrCanonicalizeUCS2h"></a>
<div class="delfile"><h4>Deleted: trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.h (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.h        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.h        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -1,138 +0,0 @@
</span><del>-/*
- * Copyright (C) 2012 Apple Inc. All rights reserved.
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- * 1. Redistributions of source code must retain the above copyright
- *    notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- *    notice, this list of conditions and the following disclaimer in the
- *    documentation and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
- * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
- * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
- * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
- * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
- * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
- * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
- * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
- * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
- */
-
-#ifndef YarrCanonicalizeUCS2_H
-#define YarrCanonicalizeUCS2_H
-
-#include &lt;stdint.h&gt;
-#include &lt;unicode/utypes.h&gt;
-
-namespace JSC { namespace Yarr {
-
-// This set of data (autogenerated using YarrCanonicalizeUCS2.js into YarrCanonicalizeUCS2.cpp)
-// provides information for each UCS2 code point as to the set of code points that it should
-// match under the ES5.1 case insensitive RegExp matching rules, specified in 15.10.2.8.
-enum UCS2CanonicalizationType {
-    CanonicalizeUnique,               // No canonically equal values, e.g. 0x0.
-    CanonicalizeSet,                  // Value indicates a set in characterSetInfo.
-    CanonicalizeRangeLo,              // Value is positive delta to pair, E.g. 0x41 has value 0x20, -&gt; 0x61.
-    CanonicalizeRangeHi,              // Value is positive delta to pair, E.g. 0x61 has value 0x20, -&gt; 0x41.
-    CanonicalizeAlternatingAligned,   // Aligned consecutive pair, e.g. 0x1f4,0x1f5.
-    CanonicalizeAlternatingUnaligned, // Unaligned consecutive pair, e.g. 0x241,0x242.
-};
-struct UCS2CanonicalizationRange { uint16_t begin, end, value, type; };
-extern const size_t UCS2_CANONICALIZATION_RANGES;
-extern const uint16_t* const characterSetInfo[];
-extern const UCS2CanonicalizationRange rangeInfo[];
-
-// This table is similar to the full rangeInfo table, however this maps from UCS2 codepoints to
-// the set of Latin1 codepoints that could match.
-enum LatinCanonicalizationType {
-    CanonicalizeLatinSelf,     // This character is in the Latin1 range, but has no canonical equivalent in the range.
-    CanonicalizeLatinMask0x20, // One of a pair of characters, under the mask 0x20.
-    CanonicalizeLatinOther,    // This character is not in the Latin1 range, but canonicalizes to another that is.
-    CanonicalizeLatinInvalid,  // Cannot match against Latin1 input.
-};
-struct LatinCanonicalizationRange { uint16_t begin, end, value, type; };
-extern const size_t LATIN_CANONICALIZATION_RANGES;
-extern LatinCanonicalizationRange latinRangeInfo[];
-
-// This searches in log2 time over ~364 entries, so should typically result in 8 compares.
-inline const UCS2CanonicalizationRange* rangeInfoFor(UChar ch)
-{
-    const UCS2CanonicalizationRange* info = rangeInfo;
-    size_t entries = UCS2_CANONICALIZATION_RANGES;
-
-    while (true) {
-        size_t candidate = entries &gt;&gt; 1;
-        const UCS2CanonicalizationRange* candidateInfo = info + candidate;
-        if (ch &lt; candidateInfo-&gt;begin)
-            entries = candidate;
-        else if (ch &lt;= candidateInfo-&gt;end)
-            return candidateInfo;
-        else {
-            info = candidateInfo + 1;
-            entries -= (candidate + 1);
-        }
-    }
-}
-
-// Should only be called for characters that have one canonically matching value.
-inline UChar getCanonicalPair(const UCS2CanonicalizationRange* info, UChar ch)
-{
-    ASSERT(ch &gt;= info-&gt;begin &amp;&amp; ch &lt;= info-&gt;end);
-    switch (info-&gt;type) {
-    case CanonicalizeRangeLo:
-        return ch + info-&gt;value;
-    case CanonicalizeRangeHi:
-        return ch - info-&gt;value;
-    case CanonicalizeAlternatingAligned:
-        return ch ^ 1;
-    case CanonicalizeAlternatingUnaligned:
-        return ((ch - 1) ^ 1) + 1;
-    default:
-        RELEASE_ASSERT_NOT_REACHED();
-    }
-    RELEASE_ASSERT_NOT_REACHED();
-    return 0;
-}
-
-// Returns true if no other UCS2 codepoint can match this value.
-inline bool isCanonicallyUnique(UChar ch)
-{
-    return rangeInfoFor(ch)-&gt;type == CanonicalizeUnique;
-}
-
-// Returns true if values are equal, under the canonicalization rules.
-inline bool areCanonicallyEquivalent(UChar a, UChar b)
-{
-    const UCS2CanonicalizationRange* info = rangeInfoFor(a);
-    switch (info-&gt;type) {
-    case CanonicalizeUnique:
-        return a == b;
-    case CanonicalizeSet: {
-        for (const uint16_t* set = characterSetInfo[info-&gt;value]; (a = *set); ++set) {
-            if (a == b)
-                return true;
-        }
-        return false;
-    }
-    case CanonicalizeRangeLo:
-        return (a == b) || (a + info-&gt;value == b);
-    case CanonicalizeRangeHi:
-        return (a == b) || (a - info-&gt;value == b);
-    case CanonicalizeAlternatingAligned:
-        return (a | 1) == (b | 1);
-    case CanonicalizeAlternatingUnaligned:
-        return ((a - 1) | 1) == ((b - 1) | 1);
-    }
-
-    RELEASE_ASSERT_NOT_REACHED();
-    return false;
-}
-
-} } // JSC::Yarr
-
-#endif
</del></span></pre></div>
<a id="trunkSourceJavaScriptCoreyarrYarrCanonicalizeUCS2js"></a>
<div class="delfile"><h4>Deleted: trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.js (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.js        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.js        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -1,219 +0,0 @@
</span><del>-/*
- * Copyright (C) 2012 Apple Inc. All rights reserved.
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- * 1. Redistributions of source code must retain the above copyright
- *    notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- *    notice, this list of conditions and the following disclaimer in the
- *    documentation and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
- * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
- * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
- * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
- * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
- * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
- * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
- * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
- * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
- */
-
-// See ES 5.1, 15.10.2.8
-function canonicalize(ch)
-{
-    var u = String.fromCharCode(ch).toUpperCase();
-    if (u.length &gt; 1)
-        return ch;
-    var cu = u.charCodeAt(0);
-    if (ch &gt;= 128 &amp;&amp; cu &lt; 128)
-        return ch;
-    return cu;
-}
-
-var MAX_UCS2 = 0xFFFF;
-var MAX_LATIN = 0xFF;
-
-var groupedCanonically = [];
-// Pass 1: populate groupedCanonically - this is a mapping from canonicalized
-// values back to the set of character codes that canonicalize to them.
-for (var i = 0; i &lt;= MAX_UCS2; ++i) {
-    var ch = canonicalize(i);
-    if (!groupedCanonically[ch])
-        groupedCanonically[ch] = [];
-    groupedCanonically[ch].push(i);
-}
-
-var typeInfo = [];
-var latinTypeInfo = [];
-var characterSetInfo = [];
-// Pass 2: populate typeInfo &amp; characterSetInfo. For every character calculate
-// a typeInfo value, described by the types above, and a value payload.
-for (cu in groupedCanonically) {
-    // The set of characters that canonicalize to cu
-    var characters = groupedCanonically[cu];
-
-    // If there is only one, it is unique.
-    if (characters.length == 1) {
-        typeInfo[characters[0]] = &quot;CanonicalizeUnique:0&quot;;
-        latinTypeInfo[characters[0]] = characters[0] &lt;= MAX_LATIN ? &quot;CanonicalizeLatinSelf:0&quot; : &quot;CanonicalizeLatinInvalid:0&quot;;
-        continue;
-    }
-
-    // Sort the array.
-    characters.sort(function(x,y){return x-y;});
-
-    // If there are more than two characters, create an entry in characterSetInfo.
-    if (characters.length &gt; 2) {
-        for (i in characters)
-            typeInfo[characters[i]] = &quot;CanonicalizeSet:&quot; + characterSetInfo.length;
-        characterSetInfo.push(characters);
-
-        if (characters[1] &lt;= MAX_LATIN)
-            throw new Error(&quot;sets with more than one latin character not supported!&quot;);
-        if (characters[0] &lt;= MAX_LATIN) {
-            for (i in characters)
-                latinTypeInfo[characters[i]] = &quot;CanonicalizeLatinOther:&quot; + characters[0];
-            latinTypeInfo[characters[0]] = &quot;CanonicalizeLatinSelf:0&quot;;
-        } else {
-            for (i in characters)
-                latinTypeInfo[characters[i]] = &quot;CanonicalizeLatinInvalid:0&quot;;
-        }
-
-        continue;
-    }
-
-    // We have a pair, mark alternating ranges, otherwise track whether this is the low or high partner.
-    var lo = characters[0];
-    var hi = characters[1];
-    var delta = hi - lo;
-    if (delta == 1) {
-        var type = lo &amp; 1 ? &quot;CanonicalizeAlternatingUnaligned:0&quot; : &quot;CanonicalizeAlternatingAligned:0&quot;;
-        typeInfo[lo] = type;
-        typeInfo[hi] = type;
-    } else {
-        typeInfo[lo] = &quot;CanonicalizeRangeLo:&quot; + delta;
-        typeInfo[hi] = &quot;CanonicalizeRangeHi:&quot; + delta;
-    }
-
-    if (lo &gt; MAX_LATIN) {
-        latinTypeInfo[lo] = &quot;CanonicalizeLatinInvalid:0&quot;; 
-        latinTypeInfo[hi] = &quot;CanonicalizeLatinInvalid:0&quot;;
-    } else if (hi &gt; MAX_LATIN) {
-        latinTypeInfo[lo] = &quot;CanonicalizeLatinSelf:0&quot;; 
-        latinTypeInfo[hi] = &quot;CanonicalizeLatinOther:&quot; + lo;
-    } else {
-        if (delta != 0x20 || lo &amp; 0x20)
-            throw new Error(&quot;pairs of latin characters that don't mask with 0x20 not supported!&quot;);
-        latinTypeInfo[lo] = &quot;CanonicalizeLatinMask0x20:0&quot;;
-        latinTypeInfo[hi] = &quot;CanonicalizeLatinMask0x20:0&quot;;
-    }
-}
-
-var rangeInfo = [];
-// Pass 3: coalesce types into ranges.
-for (var end = 0; end &lt;= MAX_UCS2; ++end) {
-    var begin = end;
-    var type = typeInfo[end];
-    while (end &lt; MAX_UCS2 &amp;&amp; typeInfo[end + 1] == type)
-        ++end;
-    rangeInfo.push({begin:begin, end:end, type:type});
-}
-
-var latinRangeInfo = [];
-// Pass 4: coalesce latin-1 types into ranges.
-for (var end = 0; end &lt;= MAX_UCS2; ++end) {
-    var begin = end;
-    var type = latinTypeInfo[end];
-    while (end &lt; MAX_UCS2 &amp;&amp; latinTypeInfo[end + 1] == type)
-        ++end;
-    latinRangeInfo.push({begin:begin, end:end, type:type});
-}
-
-
-// Helper function to convert a number to a fixed width hex representation of a C uint16_t.
-function hex(x)
-{
-    var s = Number(x).toString(16);
-    while (s.length &lt; 4)
-        s = 0 + s;
-    return &quot;0x&quot; + s + &quot;u&quot;;
-}
-
-var copyright = (
-    &quot;/*&quot;                                                                            + &quot;\n&quot; +
-    &quot; * Copyright (C) 2012 Apple Inc. All rights reserved.&quot;                         + &quot;\n&quot; +
-    &quot; *&quot;                                                                            + &quot;\n&quot; +
-    &quot; * Redistribution and use in source and binary forms, with or without&quot;         + &quot;\n&quot; +
-    &quot; * modification, are permitted provided that the following conditions&quot;         + &quot;\n&quot; +
-    &quot; * are met:&quot;                                                                   + &quot;\n&quot; +
-    &quot; * 1. Redistributions of source code must retain the above copyright&quot;          + &quot;\n&quot; +
-    &quot; *    notice, this list of conditions and the following disclaimer.&quot;           + &quot;\n&quot; +
-    &quot; * 2. Redistributions in binary form must reproduce the above copyright&quot;       + &quot;\n&quot; +
-    &quot; *    notice, this list of conditions and the following disclaimer in the&quot;     + &quot;\n&quot; +
-    &quot; *    documentation and/or other materials provided with the distribution.&quot;    + &quot;\n&quot; +
-    &quot; *&quot;                                                                            + &quot;\n&quot; +
-    &quot; * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY&quot;                  + &quot;\n&quot; +
-    &quot; * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE&quot;          + &quot;\n&quot; +
-    &quot; * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR&quot;         + &quot;\n&quot; +
-    &quot; * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR&quot;                   + &quot;\n&quot; +
-    &quot; * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,&quot;      + &quot;\n&quot; +
-    &quot; * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,&quot;        + &quot;\n&quot; +
-    &quot; * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR&quot;         + &quot;\n&quot; +
-    &quot; * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY&quot;        + &quot;\n&quot; +
-    &quot; * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT&quot;               + &quot;\n&quot; +
-    &quot; * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE&quot;      + &quot;\n&quot; +
-    &quot; * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. &quot;      + &quot;\n&quot; +
-    &quot; */&quot;);
-
-print(copyright);
-print();
-print(&quot;// DO NOT EDIT! - this file autogenerated by YarrCanonicalizeUCS2.js&quot;);
-print();
-print('#include &quot;config.h&quot;');
-print('#include &quot;YarrCanonicalizeUCS2.h&quot;');
-print();
-print(&quot;namespace JSC { namespace Yarr {&quot;);
-print();
-print(&quot;#include &lt;stdint.h&gt;&quot;);
-print();
-
-for (i in characterSetInfo) {
-    var characters = &quot;&quot;
-    var set = characterSetInfo[i];
-    for (var j in set)
-        characters += hex(set[j]) + &quot;, &quot;;
-    print(&quot;uint16_t ucs2CharacterSet&quot; + i + &quot;[] = { &quot; + characters + &quot;0 };&quot;);
-}
-print();
-print(&quot;static const size_t UCS2_CANONICALIZATION_SETS = &quot; + characterSetInfo.length + &quot;;&quot;);
-print(&quot;uint16_t* characterSetInfo[UCS2_CANONICALIZATION_SETS] = {&quot;);
-for (i in characterSetInfo)
-print(&quot;    ucs2CharacterSet&quot; + i + &quot;,&quot;);
-print(&quot;};&quot;);
-print();
-print(&quot;const size_t UCS2_CANONICALIZATION_RANGES = &quot; + rangeInfo.length + &quot;;&quot;);
-print(&quot;UCS2CanonicalizationRange rangeInfo[UCS2_CANONICALIZATION_RANGES] = {&quot;);
-for (i in rangeInfo) {
-    var info = rangeInfo[i];
-    var typeAndValue = info.type.split(':');
-    print(&quot;    { &quot; + hex(info.begin) + &quot;, &quot; + hex(info.end) + &quot;, &quot; + hex(typeAndValue[1]) + &quot;, &quot; + typeAndValue[0] + &quot; },&quot;);
-}
-print(&quot;};&quot;);
-print();
-print(&quot;const size_t LATIN_CANONICALIZATION_RANGES = &quot; + latinRangeInfo.length + &quot;;&quot;);
-print(&quot;LatinCanonicalizationRange latinRangeInfo[LATIN_CANONICALIZATION_RANGES] = {&quot;);
-for (i in latinRangeInfo) {
-    var info = latinRangeInfo[i];
-    var typeAndValue = info.type.split(':');
-    print(&quot;    { &quot; + hex(info.begin) + &quot;, &quot; + hex(info.end) + &quot;, &quot; + hex(typeAndValue[1]) + &quot;, &quot; + typeAndValue[0] + &quot; },&quot;);
-}
-print(&quot;};&quot;);
-print();
-print(&quot;} } // JSC::Yarr&quot;);
-print();
-
</del></span></pre></div>
<a id="trunkSourceJavaScriptCoreyarrYarrCanonicalizeUnicodecppfromrev197165trunkSourceJavaScriptCoreyarrYarrCanonicalizeUCS2cpp"></a>
<div class="copfile"><h4>Copied: trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.cpp (from rev 197165, trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.cpp) (0 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.cpp                                (rev 0)
+++ trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.cpp        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -0,0 +1,1182 @@
</span><ins>+/*
+ * Copyright (C) 2012-2013, 2015-2016 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+
+// DO NOT EDIT! - this file autogenerated by YarrCanonicalizeUnicode.js
+
+#include &quot;config.h&quot;
+#include &quot;YarrCanonicalizeUnicode.h&quot;
+
+namespace JSC { namespace Yarr {
+
+#include &lt;stdint.h&gt;
+
+const UChar32 ucs2CharacterSet0[] = { 0x01c4, 0x01c5, 0x01c6, 0 };
+const UChar32 ucs2CharacterSet1[] = { 0x01c7, 0x01c8, 0x01c9, 0 };
+const UChar32 ucs2CharacterSet2[] = { 0x01ca, 0x01cb, 0x01cc, 0 };
+const UChar32 ucs2CharacterSet3[] = { 0x01f1, 0x01f2, 0x01f3, 0 };
+const UChar32 ucs2CharacterSet4[] = { 0x0392, 0x03b2, 0x03d0, 0 };
+const UChar32 ucs2CharacterSet5[] = { 0x0395, 0x03b5, 0x03f5, 0 };
+const UChar32 ucs2CharacterSet6[] = { 0x0398, 0x03b8, 0x03d1, 0 };
+const UChar32 ucs2CharacterSet7[] = { 0x0345, 0x0399, 0x03b9, 0x1fbe, 0 };
+const UChar32 ucs2CharacterSet8[] = { 0x039a, 0x03ba, 0x03f0, 0 };
+const UChar32 ucs2CharacterSet9[] = { 0x00b5, 0x039c, 0x03bc, 0 };
+const UChar32 ucs2CharacterSet10[] = { 0x03a0, 0x03c0, 0x03d6, 0 };
+const UChar32 ucs2CharacterSet11[] = { 0x03a1, 0x03c1, 0x03f1, 0 };
+const UChar32 ucs2CharacterSet12[] = { 0x03a3, 0x03c2, 0x03c3, 0 };
+const UChar32 ucs2CharacterSet13[] = { 0x03a6, 0x03c6, 0x03d5, 0 };
+const UChar32 ucs2CharacterSet14[] = { 0x1e60, 0x1e61, 0x1e9b, 0 };
+
+static const size_t UCS2_CANONICALIZATION_SETS = 15;
+const UChar32* const ucs2CharacterSetInfo[UCS2_CANONICALIZATION_SETS] = {
+    ucs2CharacterSet0,
+    ucs2CharacterSet1,
+    ucs2CharacterSet2,
+    ucs2CharacterSet3,
+    ucs2CharacterSet4,
+    ucs2CharacterSet5,
+    ucs2CharacterSet6,
+    ucs2CharacterSet7,
+    ucs2CharacterSet8,
+    ucs2CharacterSet9,
+    ucs2CharacterSet10,
+    ucs2CharacterSet11,
+    ucs2CharacterSet12,
+    ucs2CharacterSet13,
+    ucs2CharacterSet14,
+};
+
+const size_t UCS2_CANONICALIZATION_RANGES = 391;
+const CanonicalizationRange ucs2RangeInfo[UCS2_CANONICALIZATION_RANGES] = {
+    { 0x0000, 0x0040, 0x0000, CanonicalizeUnique },
+    { 0x0041, 0x005a, 0x0020, CanonicalizeRangeLo },
+    { 0x005b, 0x0060, 0x0000, CanonicalizeUnique },
+    { 0x0061, 0x007a, 0x0020, CanonicalizeRangeHi },
+    { 0x007b, 0x00b4, 0x0000, CanonicalizeUnique },
+    { 0x00b5, 0x00b5, 0x0009, CanonicalizeSet },
+    { 0x00b6, 0x00bf, 0x0000, CanonicalizeUnique },
+    { 0x00c0, 0x00d6, 0x0020, CanonicalizeRangeLo },
+    { 0x00d7, 0x00d7, 0x0000, CanonicalizeUnique },
+    { 0x00d8, 0x00de, 0x0020, CanonicalizeRangeLo },
+    { 0x00df, 0x00df, 0x0000, CanonicalizeUnique },
+    { 0x00e0, 0x00f6, 0x0020, CanonicalizeRangeHi },
+    { 0x00f7, 0x00f7, 0x0000, CanonicalizeUnique },
+    { 0x00f8, 0x00fe, 0x0020, CanonicalizeRangeHi },
+    { 0x00ff, 0x00ff, 0x0079, CanonicalizeRangeLo },
+    { 0x0100, 0x012f, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0130, 0x0131, 0x0000, CanonicalizeUnique },
+    { 0x0132, 0x0137, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0138, 0x0138, 0x0000, CanonicalizeUnique },
+    { 0x0139, 0x0148, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x0149, 0x0149, 0x0000, CanonicalizeUnique },
+    { 0x014a, 0x0177, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0178, 0x0178, 0x0079, CanonicalizeRangeHi },
+    { 0x0179, 0x017e, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x017f, 0x017f, 0x0000, CanonicalizeUnique },
+    { 0x0180, 0x0180, 0x00c3, CanonicalizeRangeLo },
+    { 0x0181, 0x0181, 0x00d2, CanonicalizeRangeLo },
+    { 0x0182, 0x0185, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0186, 0x0186, 0x00ce, CanonicalizeRangeLo },
+    { 0x0187, 0x0188, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x0189, 0x018a, 0x00cd, CanonicalizeRangeLo },
+    { 0x018b, 0x018c, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x018d, 0x018d, 0x0000, CanonicalizeUnique },
+    { 0x018e, 0x018e, 0x004f, CanonicalizeRangeLo },
+    { 0x018f, 0x018f, 0x00ca, CanonicalizeRangeLo },
+    { 0x0190, 0x0190, 0x00cb, CanonicalizeRangeLo },
+    { 0x0191, 0x0192, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x0193, 0x0193, 0x00cd, CanonicalizeRangeLo },
+    { 0x0194, 0x0194, 0x00cf, CanonicalizeRangeLo },
+    { 0x0195, 0x0195, 0x0061, CanonicalizeRangeLo },
+    { 0x0196, 0x0196, 0x00d3, CanonicalizeRangeLo },
+    { 0x0197, 0x0197, 0x00d1, CanonicalizeRangeLo },
+    { 0x0198, 0x0199, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x019a, 0x019a, 0x00a3, CanonicalizeRangeLo },
+    { 0x019b, 0x019b, 0x0000, CanonicalizeUnique },
+    { 0x019c, 0x019c, 0x00d3, CanonicalizeRangeLo },
+    { 0x019d, 0x019d, 0x00d5, CanonicalizeRangeLo },
+    { 0x019e, 0x019e, 0x0082, CanonicalizeRangeLo },
+    { 0x019f, 0x019f, 0x00d6, CanonicalizeRangeLo },
+    { 0x01a0, 0x01a5, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x01a6, 0x01a6, 0x00da, CanonicalizeRangeLo },
+    { 0x01a7, 0x01a8, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x01a9, 0x01a9, 0x00da, CanonicalizeRangeLo },
+    { 0x01aa, 0x01ab, 0x0000, CanonicalizeUnique },
+    { 0x01ac, 0x01ad, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x01ae, 0x01ae, 0x00da, CanonicalizeRangeLo },
+    { 0x01af, 0x01b0, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x01b1, 0x01b2, 0x00d9, CanonicalizeRangeLo },
+    { 0x01b3, 0x01b6, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x01b7, 0x01b7, 0x00db, CanonicalizeRangeLo },
+    { 0x01b8, 0x01b9, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x01ba, 0x01bb, 0x0000, CanonicalizeUnique },
+    { 0x01bc, 0x01bd, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x01be, 0x01be, 0x0000, CanonicalizeUnique },
+    { 0x01bf, 0x01bf, 0x0038, CanonicalizeRangeLo },
+    { 0x01c0, 0x01c3, 0x0000, CanonicalizeUnique },
+    { 0x01c4, 0x01c6, 0x0000, CanonicalizeSet },
+    { 0x01c7, 0x01c9, 0x0001, CanonicalizeSet },
+    { 0x01ca, 0x01cc, 0x0002, CanonicalizeSet },
+    { 0x01cd, 0x01dc, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x01dd, 0x01dd, 0x004f, CanonicalizeRangeHi },
+    { 0x01de, 0x01ef, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x01f0, 0x01f0, 0x0000, CanonicalizeUnique },
+    { 0x01f1, 0x01f3, 0x0003, CanonicalizeSet },
+    { 0x01f4, 0x01f5, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x01f6, 0x01f6, 0x0061, CanonicalizeRangeHi },
+    { 0x01f7, 0x01f7, 0x0038, CanonicalizeRangeHi },
+    { 0x01f8, 0x021f, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0220, 0x0220, 0x0082, CanonicalizeRangeHi },
+    { 0x0221, 0x0221, 0x0000, CanonicalizeUnique },
+    { 0x0222, 0x0233, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0234, 0x0239, 0x0000, CanonicalizeUnique },
+    { 0x023a, 0x023a, 0x2a2b, CanonicalizeRangeLo },
+    { 0x023b, 0x023c, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x023d, 0x023d, 0x00a3, CanonicalizeRangeHi },
+    { 0x023e, 0x023e, 0x2a28, CanonicalizeRangeLo },
+    { 0x023f, 0x0240, 0x2a3f, CanonicalizeRangeLo },
+    { 0x0241, 0x0242, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x0243, 0x0243, 0x00c3, CanonicalizeRangeHi },
+    { 0x0244, 0x0244, 0x0045, CanonicalizeRangeLo },
+    { 0x0245, 0x0245, 0x0047, CanonicalizeRangeLo },
+    { 0x0246, 0x024f, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0250, 0x0250, 0x2a1f, CanonicalizeRangeLo },
+    { 0x0251, 0x0251, 0x2a1c, CanonicalizeRangeLo },
+    { 0x0252, 0x0252, 0x2a1e, CanonicalizeRangeLo },
+    { 0x0253, 0x0253, 0x00d2, CanonicalizeRangeHi },
+    { 0x0254, 0x0254, 0x00ce, CanonicalizeRangeHi },
+    { 0x0255, 0x0255, 0x0000, CanonicalizeUnique },
+    { 0x0256, 0x0257, 0x00cd, CanonicalizeRangeHi },
+    { 0x0258, 0x0258, 0x0000, CanonicalizeUnique },
+    { 0x0259, 0x0259, 0x00ca, CanonicalizeRangeHi },
+    { 0x025a, 0x025a, 0x0000, CanonicalizeUnique },
+    { 0x025b, 0x025b, 0x00cb, CanonicalizeRangeHi },
+    { 0x025c, 0x025c, 0xa54f, CanonicalizeRangeLo },
+    { 0x025d, 0x025f, 0x0000, CanonicalizeUnique },
+    { 0x0260, 0x0260, 0x00cd, CanonicalizeRangeHi },
+    { 0x0261, 0x0261, 0xa54b, CanonicalizeRangeLo },
+    { 0x0262, 0x0262, 0x0000, CanonicalizeUnique },
+    { 0x0263, 0x0263, 0x00cf, CanonicalizeRangeHi },
+    { 0x0264, 0x0264, 0x0000, CanonicalizeUnique },
+    { 0x0265, 0x0265, 0xa528, CanonicalizeRangeLo },
+    { 0x0266, 0x0266, 0xa544, CanonicalizeRangeLo },
+    { 0x0267, 0x0267, 0x0000, CanonicalizeUnique },
+    { 0x0268, 0x0268, 0x00d1, CanonicalizeRangeHi },
+    { 0x0269, 0x0269, 0x00d3, CanonicalizeRangeHi },
+    { 0x026a, 0x026a, 0x0000, CanonicalizeUnique },
+    { 0x026b, 0x026b, 0x29f7, CanonicalizeRangeLo },
+    { 0x026c, 0x026c, 0xa541, CanonicalizeRangeLo },
+    { 0x026d, 0x026e, 0x0000, CanonicalizeUnique },
+    { 0x026f, 0x026f, 0x00d3, CanonicalizeRangeHi },
+    { 0x0270, 0x0270, 0x0000, CanonicalizeUnique },
+    { 0x0271, 0x0271, 0x29fd, CanonicalizeRangeLo },
+    { 0x0272, 0x0272, 0x00d5, CanonicalizeRangeHi },
+    { 0x0273, 0x0274, 0x0000, CanonicalizeUnique },
+    { 0x0275, 0x0275, 0x00d6, CanonicalizeRangeHi },
+    { 0x0276, 0x027c, 0x0000, CanonicalizeUnique },
+    { 0x027d, 0x027d, 0x29e7, CanonicalizeRangeLo },
+    { 0x027e, 0x027f, 0x0000, CanonicalizeUnique },
+    { 0x0280, 0x0280, 0x00da, CanonicalizeRangeHi },
+    { 0x0281, 0x0282, 0x0000, CanonicalizeUnique },
+    { 0x0283, 0x0283, 0x00da, CanonicalizeRangeHi },
+    { 0x0284, 0x0286, 0x0000, CanonicalizeUnique },
+    { 0x0287, 0x0287, 0xa52a, CanonicalizeRangeLo },
+    { 0x0288, 0x0288, 0x00da, CanonicalizeRangeHi },
+    { 0x0289, 0x0289, 0x0045, CanonicalizeRangeHi },
+    { 0x028a, 0x028b, 0x00d9, CanonicalizeRangeHi },
+    { 0x028c, 0x028c, 0x0047, CanonicalizeRangeHi },
+    { 0x028d, 0x0291, 0x0000, CanonicalizeUnique },
+    { 0x0292, 0x0292, 0x00db, CanonicalizeRangeHi },
+    { 0x0293, 0x029d, 0x0000, CanonicalizeUnique },
+    { 0x029e, 0x029e, 0xa512, CanonicalizeRangeLo },
+    { 0x029f, 0x0344, 0x0000, CanonicalizeUnique },
+    { 0x0345, 0x0345, 0x0007, CanonicalizeSet },
+    { 0x0346, 0x036f, 0x0000, CanonicalizeUnique },
+    { 0x0370, 0x0373, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0374, 0x0375, 0x0000, CanonicalizeUnique },
+    { 0x0376, 0x0377, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0378, 0x037a, 0x0000, CanonicalizeUnique },
+    { 0x037b, 0x037d, 0x0082, CanonicalizeRangeLo },
+    { 0x037e, 0x037e, 0x0000, CanonicalizeUnique },
+    { 0x037f, 0x037f, 0x0074, CanonicalizeRangeLo },
+    { 0x0380, 0x0385, 0x0000, CanonicalizeUnique },
+    { 0x0386, 0x0386, 0x0026, CanonicalizeRangeLo },
+    { 0x0387, 0x0387, 0x0000, CanonicalizeUnique },
+    { 0x0388, 0x038a, 0x0025, CanonicalizeRangeLo },
+    { 0x038b, 0x038b, 0x0000, CanonicalizeUnique },
+    { 0x038c, 0x038c, 0x0040, CanonicalizeRangeLo },
+    { 0x038d, 0x038d, 0x0000, CanonicalizeUnique },
+    { 0x038e, 0x038f, 0x003f, CanonicalizeRangeLo },
+    { 0x0390, 0x0390, 0x0000, CanonicalizeUnique },
+    { 0x0391, 0x0391, 0x0020, CanonicalizeRangeLo },
+    { 0x0392, 0x0392, 0x0004, CanonicalizeSet },
+    { 0x0393, 0x0394, 0x0020, CanonicalizeRangeLo },
+    { 0x0395, 0x0395, 0x0005, CanonicalizeSet },
+    { 0x0396, 0x0397, 0x0020, CanonicalizeRangeLo },
+    { 0x0398, 0x0398, 0x0006, CanonicalizeSet },
+    { 0x0399, 0x0399, 0x0007, CanonicalizeSet },
+    { 0x039a, 0x039a, 0x0008, CanonicalizeSet },
+    { 0x039b, 0x039b, 0x0020, CanonicalizeRangeLo },
+    { 0x039c, 0x039c, 0x0009, CanonicalizeSet },
+    { 0x039d, 0x039f, 0x0020, CanonicalizeRangeLo },
+    { 0x03a0, 0x03a0, 0x000a, CanonicalizeSet },
+    { 0x03a1, 0x03a1, 0x000b, CanonicalizeSet },
+    { 0x03a2, 0x03a2, 0x0000, CanonicalizeUnique },
+    { 0x03a3, 0x03a3, 0x000c, CanonicalizeSet },
+    { 0x03a4, 0x03a5, 0x0020, CanonicalizeRangeLo },
+    { 0x03a6, 0x03a6, 0x000d, CanonicalizeSet },
+    { 0x03a7, 0x03ab, 0x0020, CanonicalizeRangeLo },
+    { 0x03ac, 0x03ac, 0x0026, CanonicalizeRangeHi },
+    { 0x03ad, 0x03af, 0x0025, CanonicalizeRangeHi },
+    { 0x03b0, 0x03b0, 0x0000, CanonicalizeUnique },
+    { 0x03b1, 0x03b1, 0x0020, CanonicalizeRangeHi },
+    { 0x03b2, 0x03b2, 0x0004, CanonicalizeSet },
+    { 0x03b3, 0x03b4, 0x0020, CanonicalizeRangeHi },
+    { 0x03b5, 0x03b5, 0x0005, CanonicalizeSet },
+    { 0x03b6, 0x03b7, 0x0020, CanonicalizeRangeHi },
+    { 0x03b8, 0x03b8, 0x0006, CanonicalizeSet },
+    { 0x03b9, 0x03b9, 0x0007, CanonicalizeSet },
+    { 0x03ba, 0x03ba, 0x0008, CanonicalizeSet },
+    { 0x03bb, 0x03bb, 0x0020, CanonicalizeRangeHi },
+    { 0x03bc, 0x03bc, 0x0009, CanonicalizeSet },
+    { 0x03bd, 0x03bf, 0x0020, CanonicalizeRangeHi },
+    { 0x03c0, 0x03c0, 0x000a, CanonicalizeSet },
+    { 0x03c1, 0x03c1, 0x000b, CanonicalizeSet },
+    { 0x03c2, 0x03c3, 0x000c, CanonicalizeSet },
+    { 0x03c4, 0x03c5, 0x0020, CanonicalizeRangeHi },
+    { 0x03c6, 0x03c6, 0x000d, CanonicalizeSet },
+    { 0x03c7, 0x03cb, 0x0020, CanonicalizeRangeHi },
+    { 0x03cc, 0x03cc, 0x0040, CanonicalizeRangeHi },
+    { 0x03cd, 0x03ce, 0x003f, CanonicalizeRangeHi },
+    { 0x03cf, 0x03cf, 0x0008, CanonicalizeRangeLo },
+    { 0x03d0, 0x03d0, 0x0004, CanonicalizeSet },
+    { 0x03d1, 0x03d1, 0x0006, CanonicalizeSet },
+    { 0x03d2, 0x03d4, 0x0000, CanonicalizeUnique },
+    { 0x03d5, 0x03d5, 0x000d, CanonicalizeSet },
+    { 0x03d6, 0x03d6, 0x000a, CanonicalizeSet },
+    { 0x03d7, 0x03d7, 0x0008, CanonicalizeRangeHi },
+    { 0x03d8, 0x03ef, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x03f0, 0x03f0, 0x0008, CanonicalizeSet },
+    { 0x03f1, 0x03f1, 0x000b, CanonicalizeSet },
+    { 0x03f2, 0x03f2, 0x0007, CanonicalizeRangeLo },
+    { 0x03f3, 0x03f3, 0x0074, CanonicalizeRangeHi },
+    { 0x03f4, 0x03f4, 0x0000, CanonicalizeUnique },
+    { 0x03f5, 0x03f5, 0x0005, CanonicalizeSet },
+    { 0x03f6, 0x03f6, 0x0000, CanonicalizeUnique },
+    { 0x03f7, 0x03f8, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x03f9, 0x03f9, 0x0007, CanonicalizeRangeHi },
+    { 0x03fa, 0x03fb, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x03fc, 0x03fc, 0x0000, CanonicalizeUnique },
+    { 0x03fd, 0x03ff, 0x0082, CanonicalizeRangeHi },
+    { 0x0400, 0x040f, 0x0050, CanonicalizeRangeLo },
+    { 0x0410, 0x042f, 0x0020, CanonicalizeRangeLo },
+    { 0x0430, 0x044f, 0x0020, CanonicalizeRangeHi },
+    { 0x0450, 0x045f, 0x0050, CanonicalizeRangeHi },
+    { 0x0460, 0x0481, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0482, 0x0489, 0x0000, CanonicalizeUnique },
+    { 0x048a, 0x04bf, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x04c0, 0x04c0, 0x000f, CanonicalizeRangeLo },
+    { 0x04c1, 0x04ce, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x04cf, 0x04cf, 0x000f, CanonicalizeRangeHi },
+    { 0x04d0, 0x052f, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0530, 0x0530, 0x0000, CanonicalizeUnique },
+    { 0x0531, 0x0556, 0x0030, CanonicalizeRangeLo },
+    { 0x0557, 0x0560, 0x0000, CanonicalizeUnique },
+    { 0x0561, 0x0586, 0x0030, CanonicalizeRangeHi },
+    { 0x0587, 0x109f, 0x0000, CanonicalizeUnique },
+    { 0x10a0, 0x10c5, 0x1c60, CanonicalizeRangeLo },
+    { 0x10c6, 0x10c6, 0x0000, CanonicalizeUnique },
+    { 0x10c7, 0x10c7, 0x1c60, CanonicalizeRangeLo },
+    { 0x10c8, 0x10cc, 0x0000, CanonicalizeUnique },
+    { 0x10cd, 0x10cd, 0x1c60, CanonicalizeRangeLo },
+    { 0x10ce, 0x1d78, 0x0000, CanonicalizeUnique },
+    { 0x1d79, 0x1d79, 0x8a04, CanonicalizeRangeLo },
+    { 0x1d7a, 0x1d7c, 0x0000, CanonicalizeUnique },
+    { 0x1d7d, 0x1d7d, 0x0ee6, CanonicalizeRangeLo },
+    { 0x1d7e, 0x1dff, 0x0000, CanonicalizeUnique },
+    { 0x1e00, 0x1e5f, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x1e60, 0x1e61, 0x000e, CanonicalizeSet },
+    { 0x1e62, 0x1e95, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x1e96, 0x1e9a, 0x0000, CanonicalizeUnique },
+    { 0x1e9b, 0x1e9b, 0x000e, CanonicalizeSet },
+    { 0x1e9c, 0x1e9f, 0x0000, CanonicalizeUnique },
+    { 0x1ea0, 0x1eff, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x1f00, 0x1f07, 0x0008, CanonicalizeRangeLo },
+    { 0x1f08, 0x1f0f, 0x0008, CanonicalizeRangeHi },
+    { 0x1f10, 0x1f15, 0x0008, CanonicalizeRangeLo },
+    { 0x1f16, 0x1f17, 0x0000, CanonicalizeUnique },
+    { 0x1f18, 0x1f1d, 0x0008, CanonicalizeRangeHi },
+    { 0x1f1e, 0x1f1f, 0x0000, CanonicalizeUnique },
+    { 0x1f20, 0x1f27, 0x0008, CanonicalizeRangeLo },
+    { 0x1f28, 0x1f2f, 0x0008, CanonicalizeRangeHi },
+    { 0x1f30, 0x1f37, 0x0008, CanonicalizeRangeLo },
+    { 0x1f38, 0x1f3f, 0x0008, CanonicalizeRangeHi },
+    { 0x1f40, 0x1f45, 0x0008, CanonicalizeRangeLo },
+    { 0x1f46, 0x1f47, 0x0000, CanonicalizeUnique },
+    { 0x1f48, 0x1f4d, 0x0008, CanonicalizeRangeHi },
+    { 0x1f4e, 0x1f50, 0x0000, CanonicalizeUnique },
+    { 0x1f51, 0x1f51, 0x0008, CanonicalizeRangeLo },
+    { 0x1f52, 0x1f52, 0x0000, CanonicalizeUnique },
+    { 0x1f53, 0x1f53, 0x0008, CanonicalizeRangeLo },
+    { 0x1f54, 0x1f54, 0x0000, CanonicalizeUnique },
+    { 0x1f55, 0x1f55, 0x0008, CanonicalizeRangeLo },
+    { 0x1f56, 0x1f56, 0x0000, CanonicalizeUnique },
+    { 0x1f57, 0x1f57, 0x0008, CanonicalizeRangeLo },
+    { 0x1f58, 0x1f58, 0x0000, CanonicalizeUnique },
+    { 0x1f59, 0x1f59, 0x0008, CanonicalizeRangeHi },
+    { 0x1f5a, 0x1f5a, 0x0000, CanonicalizeUnique },
+    { 0x1f5b, 0x1f5b, 0x0008, CanonicalizeRangeHi },
+    { 0x1f5c, 0x1f5c, 0x0000, CanonicalizeUnique },
+    { 0x1f5d, 0x1f5d, 0x0008, CanonicalizeRangeHi },
+    { 0x1f5e, 0x1f5e, 0x0000, CanonicalizeUnique },
+    { 0x1f5f, 0x1f5f, 0x0008, CanonicalizeRangeHi },
+    { 0x1f60, 0x1f67, 0x0008, CanonicalizeRangeLo },
+    { 0x1f68, 0x1f6f, 0x0008, CanonicalizeRangeHi },
+    { 0x1f70, 0x1f71, 0x004a, CanonicalizeRangeLo },
+    { 0x1f72, 0x1f75, 0x0056, CanonicalizeRangeLo },
+    { 0x1f76, 0x1f77, 0x0064, CanonicalizeRangeLo },
+    { 0x1f78, 0x1f79, 0x0080, CanonicalizeRangeLo },
+    { 0x1f7a, 0x1f7b, 0x0070, CanonicalizeRangeLo },
+    { 0x1f7c, 0x1f7d, 0x007e, CanonicalizeRangeLo },
+    { 0x1f7e, 0x1faf, 0x0000, CanonicalizeUnique },
+    { 0x1fb0, 0x1fb1, 0x0008, CanonicalizeRangeLo },
+    { 0x1fb2, 0x1fb7, 0x0000, CanonicalizeUnique },
+    { 0x1fb8, 0x1fb9, 0x0008, CanonicalizeRangeHi },
+    { 0x1fba, 0x1fbb, 0x004a, CanonicalizeRangeHi },
+    { 0x1fbc, 0x1fbd, 0x0000, CanonicalizeUnique },
+    { 0x1fbe, 0x1fbe, 0x0007, CanonicalizeSet },
+    { 0x1fbf, 0x1fc7, 0x0000, CanonicalizeUnique },
+    { 0x1fc8, 0x1fcb, 0x0056, CanonicalizeRangeHi },
+    { 0x1fcc, 0x1fcf, 0x0000, CanonicalizeUnique },
+    { 0x1fd0, 0x1fd1, 0x0008, CanonicalizeRangeLo },
+    { 0x1fd2, 0x1fd7, 0x0000, CanonicalizeUnique },
+    { 0x1fd8, 0x1fd9, 0x0008, CanonicalizeRangeHi },
+    { 0x1fda, 0x1fdb, 0x0064, CanonicalizeRangeHi },
+    { 0x1fdc, 0x1fdf, 0x0000, CanonicalizeUnique },
+    { 0x1fe0, 0x1fe1, 0x0008, CanonicalizeRangeLo },
+    { 0x1fe2, 0x1fe4, 0x0000, CanonicalizeUnique },
+    { 0x1fe5, 0x1fe5, 0x0007, CanonicalizeRangeLo },
+    { 0x1fe6, 0x1fe7, 0x0000, CanonicalizeUnique },
+    { 0x1fe8, 0x1fe9, 0x0008, CanonicalizeRangeHi },
+    { 0x1fea, 0x1feb, 0x0070, CanonicalizeRangeHi },
+    { 0x1fec, 0x1fec, 0x0007, CanonicalizeRangeHi },
+    { 0x1fed, 0x1ff7, 0x0000, CanonicalizeUnique },
+    { 0x1ff8, 0x1ff9, 0x0080, CanonicalizeRangeHi },
+    { 0x1ffa, 0x1ffb, 0x007e, CanonicalizeRangeHi },
+    { 0x1ffc, 0x2131, 0x0000, CanonicalizeUnique },
+    { 0x2132, 0x2132, 0x001c, CanonicalizeRangeLo },
+    { 0x2133, 0x214d, 0x0000, CanonicalizeUnique },
+    { 0x214e, 0x214e, 0x001c, CanonicalizeRangeHi },
+    { 0x214f, 0x215f, 0x0000, CanonicalizeUnique },
+    { 0x2160, 0x216f, 0x0010, CanonicalizeRangeLo },
+    { 0x2170, 0x217f, 0x0010, CanonicalizeRangeHi },
+    { 0x2180, 0x2182, 0x0000, CanonicalizeUnique },
+    { 0x2183, 0x2184, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x2185, 0x24b5, 0x0000, CanonicalizeUnique },
+    { 0x24b6, 0x24cf, 0x001a, CanonicalizeRangeLo },
+    { 0x24d0, 0x24e9, 0x001a, CanonicalizeRangeHi },
+    { 0x24ea, 0x2bff, 0x0000, CanonicalizeUnique },
+    { 0x2c00, 0x2c2e, 0x0030, CanonicalizeRangeLo },
+    { 0x2c2f, 0x2c2f, 0x0000, CanonicalizeUnique },
+    { 0x2c30, 0x2c5e, 0x0030, CanonicalizeRangeHi },
+    { 0x2c5f, 0x2c5f, 0x0000, CanonicalizeUnique },
+    { 0x2c60, 0x2c61, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x2c62, 0x2c62, 0x29f7, CanonicalizeRangeHi },
+    { 0x2c63, 0x2c63, 0x0ee6, CanonicalizeRangeHi },
+    { 0x2c64, 0x2c64, 0x29e7, CanonicalizeRangeHi },
+    { 0x2c65, 0x2c65, 0x2a2b, CanonicalizeRangeHi },
+    { 0x2c66, 0x2c66, 0x2a28, CanonicalizeRangeHi },
+    { 0x2c67, 0x2c6c, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x2c6d, 0x2c6d, 0x2a1c, CanonicalizeRangeHi },
+    { 0x2c6e, 0x2c6e, 0x29fd, CanonicalizeRangeHi },
+    { 0x2c6f, 0x2c6f, 0x2a1f, CanonicalizeRangeHi },
+    { 0x2c70, 0x2c70, 0x2a1e, CanonicalizeRangeHi },
+    { 0x2c71, 0x2c71, 0x0000, CanonicalizeUnique },
+    { 0x2c72, 0x2c73, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x2c74, 0x2c74, 0x0000, CanonicalizeUnique },
+    { 0x2c75, 0x2c76, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x2c77, 0x2c7d, 0x0000, CanonicalizeUnique },
+    { 0x2c7e, 0x2c7f, 0x2a3f, CanonicalizeRangeHi },
+    { 0x2c80, 0x2ce3, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x2ce4, 0x2cea, 0x0000, CanonicalizeUnique },
+    { 0x2ceb, 0x2cee, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x2cef, 0x2cf1, 0x0000, CanonicalizeUnique },
+    { 0x2cf2, 0x2cf3, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x2cf4, 0x2cff, 0x0000, CanonicalizeUnique },
+    { 0x2d00, 0x2d25, 0x1c60, CanonicalizeRangeHi },
+    { 0x2d26, 0x2d26, 0x0000, CanonicalizeUnique },
+    { 0x2d27, 0x2d27, 0x1c60, CanonicalizeRangeHi },
+    { 0x2d28, 0x2d2c, 0x0000, CanonicalizeUnique },
+    { 0x2d2d, 0x2d2d, 0x1c60, CanonicalizeRangeHi },
+    { 0x2d2e, 0xa63f, 0x0000, CanonicalizeUnique },
+    { 0xa640, 0xa66d, 0x0000, CanonicalizeAlternatingAligned },
+    { 0xa66e, 0xa67f, 0x0000, CanonicalizeUnique },
+    { 0xa680, 0xa69b, 0x0000, CanonicalizeAlternatingAligned },
+    { 0xa69c, 0xa721, 0x0000, CanonicalizeUnique },
+    { 0xa722, 0xa72f, 0x0000, CanonicalizeAlternatingAligned },
+    { 0xa730, 0xa731, 0x0000, CanonicalizeUnique },
+    { 0xa732, 0xa76f, 0x0000, CanonicalizeAlternatingAligned },
+    { 0xa770, 0xa778, 0x0000, CanonicalizeUnique },
+    { 0xa779, 0xa77c, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0xa77d, 0xa77d, 0x8a04, CanonicalizeRangeHi },
+    { 0xa77e, 0xa787, 0x0000, CanonicalizeAlternatingAligned },
+    { 0xa788, 0xa78a, 0x0000, CanonicalizeUnique },
+    { 0xa78b, 0xa78c, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0xa78d, 0xa78d, 0xa528, CanonicalizeRangeHi },
+    { 0xa78e, 0xa78f, 0x0000, CanonicalizeUnique },
+    { 0xa790, 0xa793, 0x0000, CanonicalizeAlternatingAligned },
+    { 0xa794, 0xa795, 0x0000, CanonicalizeUnique },
+    { 0xa796, 0xa7a9, 0x0000, CanonicalizeAlternatingAligned },
+    { 0xa7aa, 0xa7aa, 0xa544, CanonicalizeRangeHi },
+    { 0xa7ab, 0xa7ab, 0xa54f, CanonicalizeRangeHi },
+    { 0xa7ac, 0xa7ac, 0xa54b, CanonicalizeRangeHi },
+    { 0xa7ad, 0xa7ad, 0xa541, CanonicalizeRangeHi },
+    { 0xa7ae, 0xa7af, 0x0000, CanonicalizeUnique },
+    { 0xa7b0, 0xa7b0, 0xa512, CanonicalizeRangeHi },
+    { 0xa7b1, 0xa7b1, 0xa52a, CanonicalizeRangeHi },
+    { 0xa7b2, 0xff20, 0x0000, CanonicalizeUnique },
+    { 0xff21, 0xff3a, 0x0020, CanonicalizeRangeLo },
+    { 0xff3b, 0xff40, 0x0000, CanonicalizeUnique },
+    { 0xff41, 0xff5a, 0x0020, CanonicalizeRangeHi },
+    { 0xff5b, 0xffff, 0x0000, CanonicalizeUnique },
+};
+
+const UChar32 unicodeCharacterSet0[] = { 0x0041, 0x0061, 0x1e9a, 0 };
+const UChar32 unicodeCharacterSet1[] = { 0x0046, 0x0066, 0xfb00, 0xfb01, 0xfb02, 0xfb03, 0xfb04, 0 };
+const UChar32 unicodeCharacterSet2[] = { 0x0048, 0x0068, 0x1e96, 0 };
+const UChar32 unicodeCharacterSet3[] = { 0x0049, 0x0069, 0x0131, 0 };
+const UChar32 unicodeCharacterSet4[] = { 0x004a, 0x006a, 0x01f0, 0 };
+const UChar32 unicodeCharacterSet5[] = { 0x0053, 0x0073, 0x00df, 0x017f, 0xfb05, 0xfb06, 0 };
+const UChar32 unicodeCharacterSet6[] = { 0x0054, 0x0074, 0x1e97, 0 };
+const UChar32 unicodeCharacterSet7[] = { 0x0057, 0x0077, 0x1e98, 0 };
+const UChar32 unicodeCharacterSet8[] = { 0x0059, 0x0079, 0x1e99, 0 };
+const UChar32 unicodeCharacterSet9[] = { 0x01c4, 0x01c5, 0x01c6, 0 };
+const UChar32 unicodeCharacterSet10[] = { 0x01c7, 0x01c8, 0x01c9, 0 };
+const UChar32 unicodeCharacterSet11[] = { 0x01ca, 0x01cb, 0x01cc, 0 };
+const UChar32 unicodeCharacterSet12[] = { 0x01f1, 0x01f2, 0x01f3, 0 };
+const UChar32 unicodeCharacterSet13[] = { 0x0386, 0x03ac, 0x1fb4, 0 };
+const UChar32 unicodeCharacterSet14[] = { 0x0389, 0x03ae, 0x1fc4, 0 };
+const UChar32 unicodeCharacterSet15[] = { 0x038f, 0x03ce, 0x1ff4, 0 };
+const UChar32 unicodeCharacterSet16[] = { 0x0391, 0x03b1, 0x1fb3, 0x1fb6, 0x1fb7, 0x1fbc, 0 };
+const UChar32 unicodeCharacterSet17[] = { 0x0392, 0x03b2, 0x03d0, 0 };
+const UChar32 unicodeCharacterSet18[] = { 0x0395, 0x03b5, 0x03f5, 0 };
+const UChar32 unicodeCharacterSet19[] = { 0x0397, 0x03b7, 0x1fc3, 0x1fc6, 0x1fc7, 0x1fcc, 0 };
+const UChar32 unicodeCharacterSet20[] = { 0x0398, 0x03b8, 0x03d1, 0 };
+const UChar32 unicodeCharacterSet21[] = { 0x0345, 0x0390, 0x0399, 0x03b9, 0x1fbe, 0x1fd2, 0x1fd3, 0x1fd6, 0x1fd7, 0 };
+const UChar32 unicodeCharacterSet22[] = { 0x039a, 0x03ba, 0x03f0, 0 };
+const UChar32 unicodeCharacterSet23[] = { 0x00b5, 0x039c, 0x03bc, 0 };
+const UChar32 unicodeCharacterSet24[] = { 0x03a0, 0x03c0, 0x03d6, 0 };
+const UChar32 unicodeCharacterSet25[] = { 0x03a1, 0x03c1, 0x03f1, 0x1fe4, 0 };
+const UChar32 unicodeCharacterSet26[] = { 0x03a3, 0x03c2, 0x03c3, 0 };
+const UChar32 unicodeCharacterSet27[] = { 0x03a5, 0x03b0, 0x03c5, 0x1f50, 0x1f52, 0x1f54, 0x1f56, 0x1fe2, 0x1fe3, 0x1fe6, 0x1fe7, 0 };
+const UChar32 unicodeCharacterSet28[] = { 0x03a6, 0x03c6, 0x03d5, 0 };
+const UChar32 unicodeCharacterSet29[] = { 0x03a9, 0x03c9, 0x1ff3, 0x1ff6, 0x1ff7, 0x1ffc, 0 };
+const UChar32 unicodeCharacterSet30[] = { 0x0535, 0x0565, 0x0587, 0 };
+const UChar32 unicodeCharacterSet31[] = { 0x0544, 0x0574, 0xfb13, 0xfb14, 0xfb15, 0xfb17, 0 };
+const UChar32 unicodeCharacterSet32[] = { 0x054e, 0x057e, 0xfb16, 0 };
+const UChar32 unicodeCharacterSet33[] = { 0x1e60, 0x1e61, 0x1e9b, 0 };
+const UChar32 unicodeCharacterSet34[] = { 0x1f00, 0x1f08, 0x1f80, 0x1f88, 0 };
+const UChar32 unicodeCharacterSet35[] = { 0x1f01, 0x1f09, 0x1f81, 0x1f89, 0 };
+const UChar32 unicodeCharacterSet36[] = { 0x1f02, 0x1f0a, 0x1f82, 0x1f8a, 0 };
+const UChar32 unicodeCharacterSet37[] = { 0x1f03, 0x1f0b, 0x1f83, 0x1f8b, 0 };
+const UChar32 unicodeCharacterSet38[] = { 0x1f04, 0x1f0c, 0x1f84, 0x1f8c, 0 };
+const UChar32 unicodeCharacterSet39[] = { 0x1f05, 0x1f0d, 0x1f85, 0x1f8d, 0 };
+const UChar32 unicodeCharacterSet40[] = { 0x1f06, 0x1f0e, 0x1f86, 0x1f8e, 0 };
+const UChar32 unicodeCharacterSet41[] = { 0x1f07, 0x1f0f, 0x1f87, 0x1f8f, 0 };
+const UChar32 unicodeCharacterSet42[] = { 0x1f20, 0x1f28, 0x1f90, 0x1f98, 0 };
+const UChar32 unicodeCharacterSet43[] = { 0x1f21, 0x1f29, 0x1f91, 0x1f99, 0 };
+const UChar32 unicodeCharacterSet44[] = { 0x1f22, 0x1f2a, 0x1f92, 0x1f9a, 0 };
+const UChar32 unicodeCharacterSet45[] = { 0x1f23, 0x1f2b, 0x1f93, 0x1f9b, 0 };
+const UChar32 unicodeCharacterSet46[] = { 0x1f24, 0x1f2c, 0x1f94, 0x1f9c, 0 };
+const UChar32 unicodeCharacterSet47[] = { 0x1f25, 0x1f2d, 0x1f95, 0x1f9d, 0 };
+const UChar32 unicodeCharacterSet48[] = { 0x1f26, 0x1f2e, 0x1f96, 0x1f9e, 0 };
+const UChar32 unicodeCharacterSet49[] = { 0x1f27, 0x1f2f, 0x1f97, 0x1f9f, 0 };
+const UChar32 unicodeCharacterSet50[] = { 0x1f60, 0x1f68, 0x1fa0, 0x1fa8, 0 };
+const UChar32 unicodeCharacterSet51[] = { 0x1f61, 0x1f69, 0x1fa1, 0x1fa9, 0 };
+const UChar32 unicodeCharacterSet52[] = { 0x1f62, 0x1f6a, 0x1fa2, 0x1faa, 0 };
+const UChar32 unicodeCharacterSet53[] = { 0x1f63, 0x1f6b, 0x1fa3, 0x1fab, 0 };
+const UChar32 unicodeCharacterSet54[] = { 0x1f64, 0x1f6c, 0x1fa4, 0x1fac, 0 };
+const UChar32 unicodeCharacterSet55[] = { 0x1f65, 0x1f6d, 0x1fa5, 0x1fad, 0 };
+const UChar32 unicodeCharacterSet56[] = { 0x1f66, 0x1f6e, 0x1fa6, 0x1fae, 0 };
+const UChar32 unicodeCharacterSet57[] = { 0x1f67, 0x1f6f, 0x1fa7, 0x1faf, 0 };
+const UChar32 unicodeCharacterSet58[] = { 0x1f70, 0x1fb2, 0x1fba, 0 };
+const UChar32 unicodeCharacterSet59[] = { 0x1f74, 0x1fc2, 0x1fca, 0 };
+const UChar32 unicodeCharacterSet60[] = { 0x1f7c, 0x1ff2, 0x1ffa, 0 };
+
+static const size_t UNICODE_CANONICALIZATION_SETS = 61;
+const UChar32* const unicodeCharacterSetInfo[UNICODE_CANONICALIZATION_SETS] = {
+    unicodeCharacterSet0,
+    unicodeCharacterSet1,
+    unicodeCharacterSet2,
+    unicodeCharacterSet3,
+    unicodeCharacterSet4,
+    unicodeCharacterSet5,
+    unicodeCharacterSet6,
+    unicodeCharacterSet7,
+    unicodeCharacterSet8,
+    unicodeCharacterSet9,
+    unicodeCharacterSet10,
+    unicodeCharacterSet11,
+    unicodeCharacterSet12,
+    unicodeCharacterSet13,
+    unicodeCharacterSet14,
+    unicodeCharacterSet15,
+    unicodeCharacterSet16,
+    unicodeCharacterSet17,
+    unicodeCharacterSet18,
+    unicodeCharacterSet19,
+    unicodeCharacterSet20,
+    unicodeCharacterSet21,
+    unicodeCharacterSet22,
+    unicodeCharacterSet23,
+    unicodeCharacterSet24,
+    unicodeCharacterSet25,
+    unicodeCharacterSet26,
+    unicodeCharacterSet27,
+    unicodeCharacterSet28,
+    unicodeCharacterSet29,
+    unicodeCharacterSet30,
+    unicodeCharacterSet31,
+    unicodeCharacterSet32,
+    unicodeCharacterSet33,
+    unicodeCharacterSet34,
+    unicodeCharacterSet35,
+    unicodeCharacterSet36,
+    unicodeCharacterSet37,
+    unicodeCharacterSet38,
+    unicodeCharacterSet39,
+    unicodeCharacterSet40,
+    unicodeCharacterSet41,
+    unicodeCharacterSet42,
+    unicodeCharacterSet43,
+    unicodeCharacterSet44,
+    unicodeCharacterSet45,
+    unicodeCharacterSet46,
+    unicodeCharacterSet47,
+    unicodeCharacterSet48,
+    unicodeCharacterSet49,
+    unicodeCharacterSet50,
+    unicodeCharacterSet51,
+    unicodeCharacterSet52,
+    unicodeCharacterSet53,
+    unicodeCharacterSet54,
+    unicodeCharacterSet55,
+    unicodeCharacterSet56,
+    unicodeCharacterSet57,
+    unicodeCharacterSet58,
+    unicodeCharacterSet59,
+    unicodeCharacterSet60,
+};
+
+const size_t UNICODE_CANONICALIZATION_RANGES = 585;
+const CanonicalizationRange unicodeRangeInfo[UNICODE_CANONICALIZATION_RANGES] = {
+    { 0x0000, 0x0040, 0x0000, CanonicalizeUnique },
+    { 0x0041, 0x0041, 0x0000, CanonicalizeSet },
+    { 0x0042, 0x0045, 0x0020, CanonicalizeRangeLo },
+    { 0x0046, 0x0046, 0x0001, CanonicalizeSet },
+    { 0x0047, 0x0047, 0x0020, CanonicalizeRangeLo },
+    { 0x0048, 0x0048, 0x0002, CanonicalizeSet },
+    { 0x0049, 0x0049, 0x0003, CanonicalizeSet },
+    { 0x004a, 0x004a, 0x0004, CanonicalizeSet },
+    { 0x004b, 0x0052, 0x0020, CanonicalizeRangeLo },
+    { 0x0053, 0x0053, 0x0005, CanonicalizeSet },
+    { 0x0054, 0x0054, 0x0006, CanonicalizeSet },
+    { 0x0055, 0x0056, 0x0020, CanonicalizeRangeLo },
+    { 0x0057, 0x0057, 0x0007, CanonicalizeSet },
+    { 0x0058, 0x0058, 0x0020, CanonicalizeRangeLo },
+    { 0x0059, 0x0059, 0x0008, CanonicalizeSet },
+    { 0x005a, 0x005a, 0x0020, CanonicalizeRangeLo },
+    { 0x005b, 0x0060, 0x0000, CanonicalizeUnique },
+    { 0x0061, 0x0061, 0x0000, CanonicalizeSet },
+    { 0x0062, 0x0065, 0x0020, CanonicalizeRangeHi },
+    { 0x0066, 0x0066, 0x0001, CanonicalizeSet },
+    { 0x0067, 0x0067, 0x0020, CanonicalizeRangeHi },
+    { 0x0068, 0x0068, 0x0002, CanonicalizeSet },
+    { 0x0069, 0x0069, 0x0003, CanonicalizeSet },
+    { 0x006a, 0x006a, 0x0004, CanonicalizeSet },
+    { 0x006b, 0x0072, 0x0020, CanonicalizeRangeHi },
+    { 0x0073, 0x0073, 0x0005, CanonicalizeSet },
+    { 0x0074, 0x0074, 0x0006, CanonicalizeSet },
+    { 0x0075, 0x0076, 0x0020, CanonicalizeRangeHi },
+    { 0x0077, 0x0077, 0x0007, CanonicalizeSet },
+    { 0x0078, 0x0078, 0x0020, CanonicalizeRangeHi },
+    { 0x0079, 0x0079, 0x0008, CanonicalizeSet },
+    { 0x007a, 0x007a, 0x0020, CanonicalizeRangeHi },
+    { 0x007b, 0x00b4, 0x0000, CanonicalizeUnique },
+    { 0x00b5, 0x00b5, 0x0017, CanonicalizeSet },
+    { 0x00b6, 0x00bf, 0x0000, CanonicalizeUnique },
+    { 0x00c0, 0x00d6, 0x0020, CanonicalizeRangeLo },
+    { 0x00d7, 0x00d7, 0x0000, CanonicalizeUnique },
+    { 0x00d8, 0x00de, 0x0020, CanonicalizeRangeLo },
+    { 0x00df, 0x00df, 0x0005, CanonicalizeSet },
+    { 0x00e0, 0x00f6, 0x0020, CanonicalizeRangeHi },
+    { 0x00f7, 0x00f7, 0x0000, CanonicalizeUnique },
+    { 0x00f8, 0x00fe, 0x0020, CanonicalizeRangeHi },
+    { 0x00ff, 0x00ff, 0x0079, CanonicalizeRangeLo },
+    { 0x0100, 0x012f, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0130, 0x0130, 0x0000, CanonicalizeUnique },
+    { 0x0131, 0x0131, 0x0003, CanonicalizeSet },
+    { 0x0132, 0x0137, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0138, 0x0138, 0x0000, CanonicalizeUnique },
+    { 0x0139, 0x0148, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x0149, 0x0149, 0x0173, CanonicalizeRangeLo },
+    { 0x014a, 0x0177, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0178, 0x0178, 0x0079, CanonicalizeRangeHi },
+    { 0x0179, 0x017e, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x017f, 0x017f, 0x0005, CanonicalizeSet },
+    { 0x0180, 0x0180, 0x00c3, CanonicalizeRangeLo },
+    { 0x0181, 0x0181, 0x00d2, CanonicalizeRangeLo },
+    { 0x0182, 0x0185, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0186, 0x0186, 0x00ce, CanonicalizeRangeLo },
+    { 0x0187, 0x0188, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x0189, 0x018a, 0x00cd, CanonicalizeRangeLo },
+    { 0x018b, 0x018c, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x018d, 0x018d, 0x0000, CanonicalizeUnique },
+    { 0x018e, 0x018e, 0x004f, CanonicalizeRangeLo },
+    { 0x018f, 0x018f, 0x00ca, CanonicalizeRangeLo },
+    { 0x0190, 0x0190, 0x00cb, CanonicalizeRangeLo },
+    { 0x0191, 0x0192, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x0193, 0x0193, 0x00cd, CanonicalizeRangeLo },
+    { 0x0194, 0x0194, 0x00cf, CanonicalizeRangeLo },
+    { 0x0195, 0x0195, 0x0061, CanonicalizeRangeLo },
+    { 0x0196, 0x0196, 0x00d3, CanonicalizeRangeLo },
+    { 0x0197, 0x0197, 0x00d1, CanonicalizeRangeLo },
+    { 0x0198, 0x0199, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x019a, 0x019a, 0x00a3, CanonicalizeRangeLo },
+    { 0x019b, 0x019b, 0x0000, CanonicalizeUnique },
+    { 0x019c, 0x019c, 0x00d3, CanonicalizeRangeLo },
+    { 0x019d, 0x019d, 0x00d5, CanonicalizeRangeLo },
+    { 0x019e, 0x019e, 0x0082, CanonicalizeRangeLo },
+    { 0x019f, 0x019f, 0x00d6, CanonicalizeRangeLo },
+    { 0x01a0, 0x01a5, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x01a6, 0x01a6, 0x00da, CanonicalizeRangeLo },
+    { 0x01a7, 0x01a8, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x01a9, 0x01a9, 0x00da, CanonicalizeRangeLo },
+    { 0x01aa, 0x01ab, 0x0000, CanonicalizeUnique },
+    { 0x01ac, 0x01ad, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x01ae, 0x01ae, 0x00da, CanonicalizeRangeLo },
+    { 0x01af, 0x01b0, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x01b1, 0x01b2, 0x00d9, CanonicalizeRangeLo },
+    { 0x01b3, 0x01b6, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x01b7, 0x01b7, 0x00db, CanonicalizeRangeLo },
+    { 0x01b8, 0x01b9, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x01ba, 0x01bb, 0x0000, CanonicalizeUnique },
+    { 0x01bc, 0x01bd, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x01be, 0x01be, 0x0000, CanonicalizeUnique },
+    { 0x01bf, 0x01bf, 0x0038, CanonicalizeRangeLo },
+    { 0x01c0, 0x01c3, 0x0000, CanonicalizeUnique },
+    { 0x01c4, 0x01c6, 0x0009, CanonicalizeSet },
+    { 0x01c7, 0x01c9, 0x000a, CanonicalizeSet },
+    { 0x01ca, 0x01cc, 0x000b, CanonicalizeSet },
+    { 0x01cd, 0x01dc, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x01dd, 0x01dd, 0x004f, CanonicalizeRangeHi },
+    { 0x01de, 0x01ef, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x01f0, 0x01f0, 0x0004, CanonicalizeSet },
+    { 0x01f1, 0x01f3, 0x000c, CanonicalizeSet },
+    { 0x01f4, 0x01f5, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x01f6, 0x01f6, 0x0061, CanonicalizeRangeHi },
+    { 0x01f7, 0x01f7, 0x0038, CanonicalizeRangeHi },
+    { 0x01f8, 0x021f, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0220, 0x0220, 0x0082, CanonicalizeRangeHi },
+    { 0x0221, 0x0221, 0x0000, CanonicalizeUnique },
+    { 0x0222, 0x0233, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0234, 0x0239, 0x0000, CanonicalizeUnique },
+    { 0x023a, 0x023a, 0x2a2b, CanonicalizeRangeLo },
+    { 0x023b, 0x023c, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x023d, 0x023d, 0x00a3, CanonicalizeRangeHi },
+    { 0x023e, 0x023e, 0x2a28, CanonicalizeRangeLo },
+    { 0x023f, 0x0240, 0x2a3f, CanonicalizeRangeLo },
+    { 0x0241, 0x0242, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x0243, 0x0243, 0x00c3, CanonicalizeRangeHi },
+    { 0x0244, 0x0244, 0x0045, CanonicalizeRangeLo },
+    { 0x0245, 0x0245, 0x0047, CanonicalizeRangeLo },
+    { 0x0246, 0x024f, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0250, 0x0250, 0x2a1f, CanonicalizeRangeLo },
+    { 0x0251, 0x0251, 0x2a1c, CanonicalizeRangeLo },
+    { 0x0252, 0x0252, 0x2a1e, CanonicalizeRangeLo },
+    { 0x0253, 0x0253, 0x00d2, CanonicalizeRangeHi },
+    { 0x0254, 0x0254, 0x00ce, CanonicalizeRangeHi },
+    { 0x0255, 0x0255, 0x0000, CanonicalizeUnique },
+    { 0x0256, 0x0257, 0x00cd, CanonicalizeRangeHi },
+    { 0x0258, 0x0258, 0x0000, CanonicalizeUnique },
+    { 0x0259, 0x0259, 0x00ca, CanonicalizeRangeHi },
+    { 0x025a, 0x025a, 0x0000, CanonicalizeUnique },
+    { 0x025b, 0x025b, 0x00cb, CanonicalizeRangeHi },
+    { 0x025c, 0x025c, 0xa54f, CanonicalizeRangeLo },
+    { 0x025d, 0x025f, 0x0000, CanonicalizeUnique },
+    { 0x0260, 0x0260, 0x00cd, CanonicalizeRangeHi },
+    { 0x0261, 0x0261, 0xa54b, CanonicalizeRangeLo },
+    { 0x0262, 0x0262, 0x0000, CanonicalizeUnique },
+    { 0x0263, 0x0263, 0x00cf, CanonicalizeRangeHi },
+    { 0x0264, 0x0264, 0x0000, CanonicalizeUnique },
+    { 0x0265, 0x0265, 0xa528, CanonicalizeRangeLo },
+    { 0x0266, 0x0266, 0xa544, CanonicalizeRangeLo },
+    { 0x0267, 0x0267, 0x0000, CanonicalizeUnique },
+    { 0x0268, 0x0268, 0x00d1, CanonicalizeRangeHi },
+    { 0x0269, 0x0269, 0x00d3, CanonicalizeRangeHi },
+    { 0x026a, 0x026a, 0x0000, CanonicalizeUnique },
+    { 0x026b, 0x026b, 0x29f7, CanonicalizeRangeLo },
+    { 0x026c, 0x026c, 0xa541, CanonicalizeRangeLo },
+    { 0x026d, 0x026e, 0x0000, CanonicalizeUnique },
+    { 0x026f, 0x026f, 0x00d3, CanonicalizeRangeHi },
+    { 0x0270, 0x0270, 0x0000, CanonicalizeUnique },
+    { 0x0271, 0x0271, 0x29fd, CanonicalizeRangeLo },
+    { 0x0272, 0x0272, 0x00d5, CanonicalizeRangeHi },
+    { 0x0273, 0x0274, 0x0000, CanonicalizeUnique },
+    { 0x0275, 0x0275, 0x00d6, CanonicalizeRangeHi },
+    { 0x0276, 0x027c, 0x0000, CanonicalizeUnique },
+    { 0x027d, 0x027d, 0x29e7, CanonicalizeRangeLo },
+    { 0x027e, 0x027f, 0x0000, CanonicalizeUnique },
+    { 0x0280, 0x0280, 0x00da, CanonicalizeRangeHi },
+    { 0x0281, 0x0282, 0x0000, CanonicalizeUnique },
+    { 0x0283, 0x0283, 0x00da, CanonicalizeRangeHi },
+    { 0x0284, 0x0286, 0x0000, CanonicalizeUnique },
+    { 0x0287, 0x0287, 0xa52a, CanonicalizeRangeLo },
+    { 0x0288, 0x0288, 0x00da, CanonicalizeRangeHi },
+    { 0x0289, 0x0289, 0x0045, CanonicalizeRangeHi },
+    { 0x028a, 0x028b, 0x00d9, CanonicalizeRangeHi },
+    { 0x028c, 0x028c, 0x0047, CanonicalizeRangeHi },
+    { 0x028d, 0x0291, 0x0000, CanonicalizeUnique },
+    { 0x0292, 0x0292, 0x00db, CanonicalizeRangeHi },
+    { 0x0293, 0x029d, 0x0000, CanonicalizeUnique },
+    { 0x029e, 0x029e, 0xa512, CanonicalizeRangeLo },
+    { 0x029f, 0x02bb, 0x0000, CanonicalizeUnique },
+    { 0x02bc, 0x02bc, 0x0173, CanonicalizeRangeHi },
+    { 0x02bd, 0x0344, 0x0000, CanonicalizeUnique },
+    { 0x0345, 0x0345, 0x0015, CanonicalizeSet },
+    { 0x0346, 0x036f, 0x0000, CanonicalizeUnique },
+    { 0x0370, 0x0373, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0374, 0x0375, 0x0000, CanonicalizeUnique },
+    { 0x0376, 0x0377, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0378, 0x037a, 0x0000, CanonicalizeUnique },
+    { 0x037b, 0x037d, 0x0082, CanonicalizeRangeLo },
+    { 0x037e, 0x037e, 0x0000, CanonicalizeUnique },
+    { 0x037f, 0x037f, 0x0074, CanonicalizeRangeLo },
+    { 0x0380, 0x0385, 0x0000, CanonicalizeUnique },
+    { 0x0386, 0x0386, 0x000d, CanonicalizeSet },
+    { 0x0387, 0x0387, 0x0000, CanonicalizeUnique },
+    { 0x0388, 0x0388, 0x0025, CanonicalizeRangeLo },
+    { 0x0389, 0x0389, 0x000e, CanonicalizeSet },
+    { 0x038a, 0x038a, 0x0025, CanonicalizeRangeLo },
+    { 0x038b, 0x038b, 0x0000, CanonicalizeUnique },
+    { 0x038c, 0x038c, 0x0040, CanonicalizeRangeLo },
+    { 0x038d, 0x038d, 0x0000, CanonicalizeUnique },
+    { 0x038e, 0x038e, 0x003f, CanonicalizeRangeLo },
+    { 0x038f, 0x038f, 0x000f, CanonicalizeSet },
+    { 0x0390, 0x0390, 0x0015, CanonicalizeSet },
+    { 0x0391, 0x0391, 0x0010, CanonicalizeSet },
+    { 0x0392, 0x0392, 0x0011, CanonicalizeSet },
+    { 0x0393, 0x0394, 0x0020, CanonicalizeRangeLo },
+    { 0x0395, 0x0395, 0x0012, CanonicalizeSet },
+    { 0x0396, 0x0396, 0x0020, CanonicalizeRangeLo },
+    { 0x0397, 0x0397, 0x0013, CanonicalizeSet },
+    { 0x0398, 0x0398, 0x0014, CanonicalizeSet },
+    { 0x0399, 0x0399, 0x0015, CanonicalizeSet },
+    { 0x039a, 0x039a, 0x0016, CanonicalizeSet },
+    { 0x039b, 0x039b, 0x0020, CanonicalizeRangeLo },
+    { 0x039c, 0x039c, 0x0017, CanonicalizeSet },
+    { 0x039d, 0x039f, 0x0020, CanonicalizeRangeLo },
+    { 0x03a0, 0x03a0, 0x0018, CanonicalizeSet },
+    { 0x03a1, 0x03a1, 0x0019, CanonicalizeSet },
+    { 0x03a2, 0x03a2, 0x0000, CanonicalizeUnique },
+    { 0x03a3, 0x03a3, 0x001a, CanonicalizeSet },
+    { 0x03a4, 0x03a4, 0x0020, CanonicalizeRangeLo },
+    { 0x03a5, 0x03a5, 0x001b, CanonicalizeSet },
+    { 0x03a6, 0x03a6, 0x001c, CanonicalizeSet },
+    { 0x03a7, 0x03a8, 0x0020, CanonicalizeRangeLo },
+    { 0x03a9, 0x03a9, 0x001d, CanonicalizeSet },
+    { 0x03aa, 0x03ab, 0x0020, CanonicalizeRangeLo },
+    { 0x03ac, 0x03ac, 0x000d, CanonicalizeSet },
+    { 0x03ad, 0x03ad, 0x0025, CanonicalizeRangeHi },
+    { 0x03ae, 0x03ae, 0x000e, CanonicalizeSet },
+    { 0x03af, 0x03af, 0x0025, CanonicalizeRangeHi },
+    { 0x03b0, 0x03b0, 0x001b, CanonicalizeSet },
+    { 0x03b1, 0x03b1, 0x0010, CanonicalizeSet },
+    { 0x03b2, 0x03b2, 0x0011, CanonicalizeSet },
+    { 0x03b3, 0x03b4, 0x0020, CanonicalizeRangeHi },
+    { 0x03b5, 0x03b5, 0x0012, CanonicalizeSet },
+    { 0x03b6, 0x03b6, 0x0020, CanonicalizeRangeHi },
+    { 0x03b7, 0x03b7, 0x0013, CanonicalizeSet },
+    { 0x03b8, 0x03b8, 0x0014, CanonicalizeSet },
+    { 0x03b9, 0x03b9, 0x0015, CanonicalizeSet },
+    { 0x03ba, 0x03ba, 0x0016, CanonicalizeSet },
+    { 0x03bb, 0x03bb, 0x0020, CanonicalizeRangeHi },
+    { 0x03bc, 0x03bc, 0x0017, CanonicalizeSet },
+    { 0x03bd, 0x03bf, 0x0020, CanonicalizeRangeHi },
+    { 0x03c0, 0x03c0, 0x0018, CanonicalizeSet },
+    { 0x03c1, 0x03c1, 0x0019, CanonicalizeSet },
+    { 0x03c2, 0x03c3, 0x001a, CanonicalizeSet },
+    { 0x03c4, 0x03c4, 0x0020, CanonicalizeRangeHi },
+    { 0x03c5, 0x03c5, 0x001b, CanonicalizeSet },
+    { 0x03c6, 0x03c6, 0x001c, CanonicalizeSet },
+    { 0x03c7, 0x03c8, 0x0020, CanonicalizeRangeHi },
+    { 0x03c9, 0x03c9, 0x001d, CanonicalizeSet },
+    { 0x03ca, 0x03cb, 0x0020, CanonicalizeRangeHi },
+    { 0x03cc, 0x03cc, 0x0040, CanonicalizeRangeHi },
+    { 0x03cd, 0x03cd, 0x003f, CanonicalizeRangeHi },
+    { 0x03ce, 0x03ce, 0x000f, CanonicalizeSet },
+    { 0x03cf, 0x03cf, 0x0008, CanonicalizeRangeLo },
+    { 0x03d0, 0x03d0, 0x0011, CanonicalizeSet },
+    { 0x03d1, 0x03d1, 0x0014, CanonicalizeSet },
+    { 0x03d2, 0x03d4, 0x0000, CanonicalizeUnique },
+    { 0x03d5, 0x03d5, 0x001c, CanonicalizeSet },
+    { 0x03d6, 0x03d6, 0x0018, CanonicalizeSet },
+    { 0x03d7, 0x03d7, 0x0008, CanonicalizeRangeHi },
+    { 0x03d8, 0x03ef, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x03f0, 0x03f0, 0x0016, CanonicalizeSet },
+    { 0x03f1, 0x03f1, 0x0019, CanonicalizeSet },
+    { 0x03f2, 0x03f2, 0x0007, CanonicalizeRangeLo },
+    { 0x03f3, 0x03f3, 0x0074, CanonicalizeRangeHi },
+    { 0x03f4, 0x03f4, 0x0000, CanonicalizeUnique },
+    { 0x03f5, 0x03f5, 0x0012, CanonicalizeSet },
+    { 0x03f6, 0x03f6, 0x0000, CanonicalizeUnique },
+    { 0x03f7, 0x03f8, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x03f9, 0x03f9, 0x0007, CanonicalizeRangeHi },
+    { 0x03fa, 0x03fb, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x03fc, 0x03fc, 0x0000, CanonicalizeUnique },
+    { 0x03fd, 0x03ff, 0x0082, CanonicalizeRangeHi },
+    { 0x0400, 0x040f, 0x0050, CanonicalizeRangeLo },
+    { 0x0410, 0x042f, 0x0020, CanonicalizeRangeLo },
+    { 0x0430, 0x044f, 0x0020, CanonicalizeRangeHi },
+    { 0x0450, 0x045f, 0x0050, CanonicalizeRangeHi },
+    { 0x0460, 0x0481, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0482, 0x0489, 0x0000, CanonicalizeUnique },
+    { 0x048a, 0x04bf, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x04c0, 0x04c0, 0x000f, CanonicalizeRangeLo },
+    { 0x04c1, 0x04ce, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x04cf, 0x04cf, 0x000f, CanonicalizeRangeHi },
+    { 0x04d0, 0x052f, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x0530, 0x0530, 0x0000, CanonicalizeUnique },
+    { 0x0531, 0x0534, 0x0030, CanonicalizeRangeLo },
+    { 0x0535, 0x0535, 0x001e, CanonicalizeSet },
+    { 0x0536, 0x0543, 0x0030, CanonicalizeRangeLo },
+    { 0x0544, 0x0544, 0x001f, CanonicalizeSet },
+    { 0x0545, 0x054d, 0x0030, CanonicalizeRangeLo },
+    { 0x054e, 0x054e, 0x0020, CanonicalizeSet },
+    { 0x054f, 0x0556, 0x0030, CanonicalizeRangeLo },
+    { 0x0557, 0x0560, 0x0000, CanonicalizeUnique },
+    { 0x0561, 0x0564, 0x0030, CanonicalizeRangeHi },
+    { 0x0565, 0x0565, 0x001e, CanonicalizeSet },
+    { 0x0566, 0x0573, 0x0030, CanonicalizeRangeHi },
+    { 0x0574, 0x0574, 0x001f, CanonicalizeSet },
+    { 0x0575, 0x057d, 0x0030, CanonicalizeRangeHi },
+    { 0x057e, 0x057e, 0x0020, CanonicalizeSet },
+    { 0x057f, 0x0586, 0x0030, CanonicalizeRangeHi },
+    { 0x0587, 0x0587, 0x001e, CanonicalizeSet },
+    { 0x0588, 0x109f, 0x0000, CanonicalizeUnique },
+    { 0x10a0, 0x10c5, 0x1c60, CanonicalizeRangeLo },
+    { 0x10c6, 0x10c6, 0x0000, CanonicalizeUnique },
+    { 0x10c7, 0x10c7, 0x1c60, CanonicalizeRangeLo },
+    { 0x10c8, 0x10cc, 0x0000, CanonicalizeUnique },
+    { 0x10cd, 0x10cd, 0x1c60, CanonicalizeRangeLo },
+    { 0x10ce, 0x1d78, 0x0000, CanonicalizeUnique },
+    { 0x1d79, 0x1d79, 0x8a04, CanonicalizeRangeLo },
+    { 0x1d7a, 0x1d7c, 0x0000, CanonicalizeUnique },
+    { 0x1d7d, 0x1d7d, 0x0ee6, CanonicalizeRangeLo },
+    { 0x1d7e, 0x1dff, 0x0000, CanonicalizeUnique },
+    { 0x1e00, 0x1e5f, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x1e60, 0x1e61, 0x0021, CanonicalizeSet },
+    { 0x1e62, 0x1e95, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x1e96, 0x1e96, 0x0002, CanonicalizeSet },
+    { 0x1e97, 0x1e97, 0x0006, CanonicalizeSet },
+    { 0x1e98, 0x1e98, 0x0007, CanonicalizeSet },
+    { 0x1e99, 0x1e99, 0x0008, CanonicalizeSet },
+    { 0x1e9a, 0x1e9a, 0x0000, CanonicalizeSet },
+    { 0x1e9b, 0x1e9b, 0x0021, CanonicalizeSet },
+    { 0x1e9c, 0x1e9f, 0x0000, CanonicalizeUnique },
+    { 0x1ea0, 0x1eff, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x1f00, 0x1f00, 0x0022, CanonicalizeSet },
+    { 0x1f01, 0x1f01, 0x0023, CanonicalizeSet },
+    { 0x1f02, 0x1f02, 0x0024, CanonicalizeSet },
+    { 0x1f03, 0x1f03, 0x0025, CanonicalizeSet },
+    { 0x1f04, 0x1f04, 0x0026, CanonicalizeSet },
+    { 0x1f05, 0x1f05, 0x0027, CanonicalizeSet },
+    { 0x1f06, 0x1f06, 0x0028, CanonicalizeSet },
+    { 0x1f07, 0x1f07, 0x0029, CanonicalizeSet },
+    { 0x1f08, 0x1f08, 0x0022, CanonicalizeSet },
+    { 0x1f09, 0x1f09, 0x0023, CanonicalizeSet },
+    { 0x1f0a, 0x1f0a, 0x0024, CanonicalizeSet },
+    { 0x1f0b, 0x1f0b, 0x0025, CanonicalizeSet },
+    { 0x1f0c, 0x1f0c, 0x0026, CanonicalizeSet },
+    { 0x1f0d, 0x1f0d, 0x0027, CanonicalizeSet },
+    { 0x1f0e, 0x1f0e, 0x0028, CanonicalizeSet },
+    { 0x1f0f, 0x1f0f, 0x0029, CanonicalizeSet },
+    { 0x1f10, 0x1f15, 0x0008, CanonicalizeRangeLo },
+    { 0x1f16, 0x1f17, 0x0000, CanonicalizeUnique },
+    { 0x1f18, 0x1f1d, 0x0008, CanonicalizeRangeHi },
+    { 0x1f1e, 0x1f1f, 0x0000, CanonicalizeUnique },
+    { 0x1f20, 0x1f20, 0x002a, CanonicalizeSet },
+    { 0x1f21, 0x1f21, 0x002b, CanonicalizeSet },
+    { 0x1f22, 0x1f22, 0x002c, CanonicalizeSet },
+    { 0x1f23, 0x1f23, 0x002d, CanonicalizeSet },
+    { 0x1f24, 0x1f24, 0x002e, CanonicalizeSet },
+    { 0x1f25, 0x1f25, 0x002f, CanonicalizeSet },
+    { 0x1f26, 0x1f26, 0x0030, CanonicalizeSet },
+    { 0x1f27, 0x1f27, 0x0031, CanonicalizeSet },
+    { 0x1f28, 0x1f28, 0x002a, CanonicalizeSet },
+    { 0x1f29, 0x1f29, 0x002b, CanonicalizeSet },
+    { 0x1f2a, 0x1f2a, 0x002c, CanonicalizeSet },
+    { 0x1f2b, 0x1f2b, 0x002d, CanonicalizeSet },
+    { 0x1f2c, 0x1f2c, 0x002e, CanonicalizeSet },
+    { 0x1f2d, 0x1f2d, 0x002f, CanonicalizeSet },
+    { 0x1f2e, 0x1f2e, 0x0030, CanonicalizeSet },
+    { 0x1f2f, 0x1f2f, 0x0031, CanonicalizeSet },
+    { 0x1f30, 0x1f37, 0x0008, CanonicalizeRangeLo },
+    { 0x1f38, 0x1f3f, 0x0008, CanonicalizeRangeHi },
+    { 0x1f40, 0x1f45, 0x0008, CanonicalizeRangeLo },
+    { 0x1f46, 0x1f47, 0x0000, CanonicalizeUnique },
+    { 0x1f48, 0x1f4d, 0x0008, CanonicalizeRangeHi },
+    { 0x1f4e, 0x1f4f, 0x0000, CanonicalizeUnique },
+    { 0x1f50, 0x1f50, 0x001b, CanonicalizeSet },
+    { 0x1f51, 0x1f51, 0x0008, CanonicalizeRangeLo },
+    { 0x1f52, 0x1f52, 0x001b, CanonicalizeSet },
+    { 0x1f53, 0x1f53, 0x0008, CanonicalizeRangeLo },
+    { 0x1f54, 0x1f54, 0x001b, CanonicalizeSet },
+    { 0x1f55, 0x1f55, 0x0008, CanonicalizeRangeLo },
+    { 0x1f56, 0x1f56, 0x001b, CanonicalizeSet },
+    { 0x1f57, 0x1f57, 0x0008, CanonicalizeRangeLo },
+    { 0x1f58, 0x1f58, 0x0000, CanonicalizeUnique },
+    { 0x1f59, 0x1f59, 0x0008, CanonicalizeRangeHi },
+    { 0x1f5a, 0x1f5a, 0x0000, CanonicalizeUnique },
+    { 0x1f5b, 0x1f5b, 0x0008, CanonicalizeRangeHi },
+    { 0x1f5c, 0x1f5c, 0x0000, CanonicalizeUnique },
+    { 0x1f5d, 0x1f5d, 0x0008, CanonicalizeRangeHi },
+    { 0x1f5e, 0x1f5e, 0x0000, CanonicalizeUnique },
+    { 0x1f5f, 0x1f5f, 0x0008, CanonicalizeRangeHi },
+    { 0x1f60, 0x1f60, 0x0032, CanonicalizeSet },
+    { 0x1f61, 0x1f61, 0x0033, CanonicalizeSet },
+    { 0x1f62, 0x1f62, 0x0034, CanonicalizeSet },
+    { 0x1f63, 0x1f63, 0x0035, CanonicalizeSet },
+    { 0x1f64, 0x1f64, 0x0036, CanonicalizeSet },
+    { 0x1f65, 0x1f65, 0x0037, CanonicalizeSet },
+    { 0x1f66, 0x1f66, 0x0038, CanonicalizeSet },
+    { 0x1f67, 0x1f67, 0x0039, CanonicalizeSet },
+    { 0x1f68, 0x1f68, 0x0032, CanonicalizeSet },
+    { 0x1f69, 0x1f69, 0x0033, CanonicalizeSet },
+    { 0x1f6a, 0x1f6a, 0x0034, CanonicalizeSet },
+    { 0x1f6b, 0x1f6b, 0x0035, CanonicalizeSet },
+    { 0x1f6c, 0x1f6c, 0x0036, CanonicalizeSet },
+    { 0x1f6d, 0x1f6d, 0x0037, CanonicalizeSet },
+    { 0x1f6e, 0x1f6e, 0x0038, CanonicalizeSet },
+    { 0x1f6f, 0x1f6f, 0x0039, CanonicalizeSet },
+    { 0x1f70, 0x1f70, 0x003a, CanonicalizeSet },
+    { 0x1f71, 0x1f71, 0x004a, CanonicalizeRangeLo },
+    { 0x1f72, 0x1f73, 0x0056, CanonicalizeRangeLo },
+    { 0x1f74, 0x1f74, 0x003b, CanonicalizeSet },
+    { 0x1f75, 0x1f75, 0x0056, CanonicalizeRangeLo },
+    { 0x1f76, 0x1f77, 0x0064, CanonicalizeRangeLo },
+    { 0x1f78, 0x1f79, 0x0080, CanonicalizeRangeLo },
+    { 0x1f7a, 0x1f7b, 0x0070, CanonicalizeRangeLo },
+    { 0x1f7c, 0x1f7c, 0x003c, CanonicalizeSet },
+    { 0x1f7d, 0x1f7d, 0x007e, CanonicalizeRangeLo },
+    { 0x1f7e, 0x1f7f, 0x0000, CanonicalizeUnique },
+    { 0x1f80, 0x1f80, 0x0022, CanonicalizeSet },
+    { 0x1f81, 0x1f81, 0x0023, CanonicalizeSet },
+    { 0x1f82, 0x1f82, 0x0024, CanonicalizeSet },
+    { 0x1f83, 0x1f83, 0x0025, CanonicalizeSet },
+    { 0x1f84, 0x1f84, 0x0026, CanonicalizeSet },
+    { 0x1f85, 0x1f85, 0x0027, CanonicalizeSet },
+    { 0x1f86, 0x1f86, 0x0028, CanonicalizeSet },
+    { 0x1f87, 0x1f87, 0x0029, CanonicalizeSet },
+    { 0x1f88, 0x1f88, 0x0022, CanonicalizeSet },
+    { 0x1f89, 0x1f89, 0x0023, CanonicalizeSet },
+    { 0x1f8a, 0x1f8a, 0x0024, CanonicalizeSet },
+    { 0x1f8b, 0x1f8b, 0x0025, CanonicalizeSet },
+    { 0x1f8c, 0x1f8c, 0x0026, CanonicalizeSet },
+    { 0x1f8d, 0x1f8d, 0x0027, CanonicalizeSet },
+    { 0x1f8e, 0x1f8e, 0x0028, CanonicalizeSet },
+    { 0x1f8f, 0x1f8f, 0x0029, CanonicalizeSet },
+    { 0x1f90, 0x1f90, 0x002a, CanonicalizeSet },
+    { 0x1f91, 0x1f91, 0x002b, CanonicalizeSet },
+    { 0x1f92, 0x1f92, 0x002c, CanonicalizeSet },
+    { 0x1f93, 0x1f93, 0x002d, CanonicalizeSet },
+    { 0x1f94, 0x1f94, 0x002e, CanonicalizeSet },
+    { 0x1f95, 0x1f95, 0x002f, CanonicalizeSet },
+    { 0x1f96, 0x1f96, 0x0030, CanonicalizeSet },
+    { 0x1f97, 0x1f97, 0x0031, CanonicalizeSet },
+    { 0x1f98, 0x1f98, 0x002a, CanonicalizeSet },
+    { 0x1f99, 0x1f99, 0x002b, CanonicalizeSet },
+    { 0x1f9a, 0x1f9a, 0x002c, CanonicalizeSet },
+    { 0x1f9b, 0x1f9b, 0x002d, CanonicalizeSet },
+    { 0x1f9c, 0x1f9c, 0x002e, CanonicalizeSet },
+    { 0x1f9d, 0x1f9d, 0x002f, CanonicalizeSet },
+    { 0x1f9e, 0x1f9e, 0x0030, CanonicalizeSet },
+    { 0x1f9f, 0x1f9f, 0x0031, CanonicalizeSet },
+    { 0x1fa0, 0x1fa0, 0x0032, CanonicalizeSet },
+    { 0x1fa1, 0x1fa1, 0x0033, CanonicalizeSet },
+    { 0x1fa2, 0x1fa2, 0x0034, CanonicalizeSet },
+    { 0x1fa3, 0x1fa3, 0x0035, CanonicalizeSet },
+    { 0x1fa4, 0x1fa4, 0x0036, CanonicalizeSet },
+    { 0x1fa5, 0x1fa5, 0x0037, CanonicalizeSet },
+    { 0x1fa6, 0x1fa6, 0x0038, CanonicalizeSet },
+    { 0x1fa7, 0x1fa7, 0x0039, CanonicalizeSet },
+    { 0x1fa8, 0x1fa8, 0x0032, CanonicalizeSet },
+    { 0x1fa9, 0x1fa9, 0x0033, CanonicalizeSet },
+    { 0x1faa, 0x1faa, 0x0034, CanonicalizeSet },
+    { 0x1fab, 0x1fab, 0x0035, CanonicalizeSet },
+    { 0x1fac, 0x1fac, 0x0036, CanonicalizeSet },
+    { 0x1fad, 0x1fad, 0x0037, CanonicalizeSet },
+    { 0x1fae, 0x1fae, 0x0038, CanonicalizeSet },
+    { 0x1faf, 0x1faf, 0x0039, CanonicalizeSet },
+    { 0x1fb0, 0x1fb1, 0x0008, CanonicalizeRangeLo },
+    { 0x1fb2, 0x1fb2, 0x003a, CanonicalizeSet },
+    { 0x1fb3, 0x1fb3, 0x0010, CanonicalizeSet },
+    { 0x1fb4, 0x1fb4, 0x000d, CanonicalizeSet },
+    { 0x1fb5, 0x1fb5, 0x0000, CanonicalizeUnique },
+    { 0x1fb6, 0x1fb7, 0x0010, CanonicalizeSet },
+    { 0x1fb8, 0x1fb9, 0x0008, CanonicalizeRangeHi },
+    { 0x1fba, 0x1fba, 0x003a, CanonicalizeSet },
+    { 0x1fbb, 0x1fbb, 0x004a, CanonicalizeRangeHi },
+    { 0x1fbc, 0x1fbc, 0x0010, CanonicalizeSet },
+    { 0x1fbd, 0x1fbd, 0x0000, CanonicalizeUnique },
+    { 0x1fbe, 0x1fbe, 0x0015, CanonicalizeSet },
+    { 0x1fbf, 0x1fc1, 0x0000, CanonicalizeUnique },
+    { 0x1fc2, 0x1fc2, 0x003b, CanonicalizeSet },
+    { 0x1fc3, 0x1fc3, 0x0013, CanonicalizeSet },
+    { 0x1fc4, 0x1fc4, 0x000e, CanonicalizeSet },
+    { 0x1fc5, 0x1fc5, 0x0000, CanonicalizeUnique },
+    { 0x1fc6, 0x1fc7, 0x0013, CanonicalizeSet },
+    { 0x1fc8, 0x1fc9, 0x0056, CanonicalizeRangeHi },
+    { 0x1fca, 0x1fca, 0x003b, CanonicalizeSet },
+    { 0x1fcb, 0x1fcb, 0x0056, CanonicalizeRangeHi },
+    { 0x1fcc, 0x1fcc, 0x0013, CanonicalizeSet },
+    { 0x1fcd, 0x1fcf, 0x0000, CanonicalizeUnique },
+    { 0x1fd0, 0x1fd1, 0x0008, CanonicalizeRangeLo },
+    { 0x1fd2, 0x1fd3, 0x0015, CanonicalizeSet },
+    { 0x1fd4, 0x1fd5, 0x0000, CanonicalizeUnique },
+    { 0x1fd6, 0x1fd7, 0x0015, CanonicalizeSet },
+    { 0x1fd8, 0x1fd9, 0x0008, CanonicalizeRangeHi },
+    { 0x1fda, 0x1fdb, 0x0064, CanonicalizeRangeHi },
+    { 0x1fdc, 0x1fdf, 0x0000, CanonicalizeUnique },
+    { 0x1fe0, 0x1fe1, 0x0008, CanonicalizeRangeLo },
+    { 0x1fe2, 0x1fe3, 0x001b, CanonicalizeSet },
+    { 0x1fe4, 0x1fe4, 0x0019, CanonicalizeSet },
+    { 0x1fe5, 0x1fe5, 0x0007, CanonicalizeRangeLo },
+    { 0x1fe6, 0x1fe7, 0x001b, CanonicalizeSet },
+    { 0x1fe8, 0x1fe9, 0x0008, CanonicalizeRangeHi },
+    { 0x1fea, 0x1feb, 0x0070, CanonicalizeRangeHi },
+    { 0x1fec, 0x1fec, 0x0007, CanonicalizeRangeHi },
+    { 0x1fed, 0x1ff1, 0x0000, CanonicalizeUnique },
+    { 0x1ff2, 0x1ff2, 0x003c, CanonicalizeSet },
+    { 0x1ff3, 0x1ff3, 0x001d, CanonicalizeSet },
+    { 0x1ff4, 0x1ff4, 0x000f, CanonicalizeSet },
+    { 0x1ff5, 0x1ff5, 0x0000, CanonicalizeUnique },
+    { 0x1ff6, 0x1ff7, 0x001d, CanonicalizeSet },
+    { 0x1ff8, 0x1ff9, 0x0080, CanonicalizeRangeHi },
+    { 0x1ffa, 0x1ffa, 0x003c, CanonicalizeSet },
+    { 0x1ffb, 0x1ffb, 0x007e, CanonicalizeRangeHi },
+    { 0x1ffc, 0x1ffc, 0x001d, CanonicalizeSet },
+    { 0x1ffd, 0x2131, 0x0000, CanonicalizeUnique },
+    { 0x2132, 0x2132, 0x001c, CanonicalizeRangeLo },
+    { 0x2133, 0x214d, 0x0000, CanonicalizeUnique },
+    { 0x214e, 0x214e, 0x001c, CanonicalizeRangeHi },
+    { 0x214f, 0x215f, 0x0000, CanonicalizeUnique },
+    { 0x2160, 0x216f, 0x0010, CanonicalizeRangeLo },
+    { 0x2170, 0x217f, 0x0010, CanonicalizeRangeHi },
+    { 0x2180, 0x2182, 0x0000, CanonicalizeUnique },
+    { 0x2183, 0x2184, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x2185, 0x24b5, 0x0000, CanonicalizeUnique },
+    { 0x24b6, 0x24cf, 0x001a, CanonicalizeRangeLo },
+    { 0x24d0, 0x24e9, 0x001a, CanonicalizeRangeHi },
+    { 0x24ea, 0x2bff, 0x0000, CanonicalizeUnique },
+    { 0x2c00, 0x2c2e, 0x0030, CanonicalizeRangeLo },
+    { 0x2c2f, 0x2c2f, 0x0000, CanonicalizeUnique },
+    { 0x2c30, 0x2c5e, 0x0030, CanonicalizeRangeHi },
+    { 0x2c5f, 0x2c5f, 0x0000, CanonicalizeUnique },
+    { 0x2c60, 0x2c61, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x2c62, 0x2c62, 0x29f7, CanonicalizeRangeHi },
+    { 0x2c63, 0x2c63, 0x0ee6, CanonicalizeRangeHi },
+    { 0x2c64, 0x2c64, 0x29e7, CanonicalizeRangeHi },
+    { 0x2c65, 0x2c65, 0x2a2b, CanonicalizeRangeHi },
+    { 0x2c66, 0x2c66, 0x2a28, CanonicalizeRangeHi },
+    { 0x2c67, 0x2c6c, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x2c6d, 0x2c6d, 0x2a1c, CanonicalizeRangeHi },
+    { 0x2c6e, 0x2c6e, 0x29fd, CanonicalizeRangeHi },
+    { 0x2c6f, 0x2c6f, 0x2a1f, CanonicalizeRangeHi },
+    { 0x2c70, 0x2c70, 0x2a1e, CanonicalizeRangeHi },
+    { 0x2c71, 0x2c71, 0x0000, CanonicalizeUnique },
+    { 0x2c72, 0x2c73, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x2c74, 0x2c74, 0x0000, CanonicalizeUnique },
+    { 0x2c75, 0x2c76, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x2c77, 0x2c7d, 0x0000, CanonicalizeUnique },
+    { 0x2c7e, 0x2c7f, 0x2a3f, CanonicalizeRangeHi },
+    { 0x2c80, 0x2ce3, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x2ce4, 0x2cea, 0x0000, CanonicalizeUnique },
+    { 0x2ceb, 0x2cee, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0x2cef, 0x2cf1, 0x0000, CanonicalizeUnique },
+    { 0x2cf2, 0x2cf3, 0x0000, CanonicalizeAlternatingAligned },
+    { 0x2cf4, 0x2cff, 0x0000, CanonicalizeUnique },
+    { 0x2d00, 0x2d25, 0x1c60, CanonicalizeRangeHi },
+    { 0x2d26, 0x2d26, 0x0000, CanonicalizeUnique },
+    { 0x2d27, 0x2d27, 0x1c60, CanonicalizeRangeHi },
+    { 0x2d28, 0x2d2c, 0x0000, CanonicalizeUnique },
+    { 0x2d2d, 0x2d2d, 0x1c60, CanonicalizeRangeHi },
+    { 0x2d2e, 0xa63f, 0x0000, CanonicalizeUnique },
+    { 0xa640, 0xa66d, 0x0000, CanonicalizeAlternatingAligned },
+    { 0xa66e, 0xa67f, 0x0000, CanonicalizeUnique },
+    { 0xa680, 0xa69b, 0x0000, CanonicalizeAlternatingAligned },
+    { 0xa69c, 0xa721, 0x0000, CanonicalizeUnique },
+    { 0xa722, 0xa72f, 0x0000, CanonicalizeAlternatingAligned },
+    { 0xa730, 0xa731, 0x0000, CanonicalizeUnique },
+    { 0xa732, 0xa76f, 0x0000, CanonicalizeAlternatingAligned },
+    { 0xa770, 0xa778, 0x0000, CanonicalizeUnique },
+    { 0xa779, 0xa77c, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0xa77d, 0xa77d, 0x8a04, CanonicalizeRangeHi },
+    { 0xa77e, 0xa787, 0x0000, CanonicalizeAlternatingAligned },
+    { 0xa788, 0xa78a, 0x0000, CanonicalizeUnique },
+    { 0xa78b, 0xa78c, 0x0000, CanonicalizeAlternatingUnaligned },
+    { 0xa78d, 0xa78d, 0xa528, CanonicalizeRangeHi },
+    { 0xa78e, 0xa78f, 0x0000, CanonicalizeUnique },
+    { 0xa790, 0xa793, 0x0000, CanonicalizeAlternatingAligned },
+    { 0xa794, 0xa795, 0x0000, CanonicalizeUnique },
+    { 0xa796, 0xa7a9, 0x0000, CanonicalizeAlternatingAligned },
+    { 0xa7aa, 0xa7aa, 0xa544, CanonicalizeRangeHi },
+    { 0xa7ab, 0xa7ab, 0xa54f, CanonicalizeRangeHi },
+    { 0xa7ac, 0xa7ac, 0xa54b, CanonicalizeRangeHi },
+    { 0xa7ad, 0xa7ad, 0xa541, CanonicalizeRangeHi },
+    { 0xa7ae, 0xa7af, 0x0000, CanonicalizeUnique },
+    { 0xa7b0, 0xa7b0, 0xa512, CanonicalizeRangeHi },
+    { 0xa7b1, 0xa7b1, 0xa52a, CanonicalizeRangeHi },
+    { 0xa7b2, 0xfaff, 0x0000, CanonicalizeUnique },
+    { 0xfb00, 0xfb04, 0x0001, CanonicalizeSet },
+    { 0xfb05, 0xfb06, 0x0005, CanonicalizeSet },
+    { 0xfb07, 0xfb12, 0x0000, CanonicalizeUnique },
+    { 0xfb13, 0xfb15, 0x001f, CanonicalizeSet },
+    { 0xfb16, 0xfb16, 0x0020, CanonicalizeSet },
+    { 0xfb17, 0xfb17, 0x001f, CanonicalizeSet },
+    { 0xfb18, 0xff20, 0x0000, CanonicalizeUnique },
+    { 0xff21, 0xff3a, 0x0020, CanonicalizeRangeLo },
+    { 0xff3b, 0xff40, 0x0000, CanonicalizeUnique },
+    { 0xff41, 0xff5a, 0x0020, CanonicalizeRangeHi },
+    { 0xff5b, 0x103ff, 0x0000, CanonicalizeUnique },
+    { 0x10400, 0x10427, 0x0028, CanonicalizeRangeLo },
+    { 0x10428, 0x1044f, 0x0028, CanonicalizeRangeHi },
+    { 0x10450, 0x1189f, 0x0000, CanonicalizeUnique },
+    { 0x118a0, 0x118bf, 0x0020, CanonicalizeRangeLo },
+    { 0x118c0, 0x118df, 0x0020, CanonicalizeRangeHi },
+    { 0x118e0, 0x10ffff, 0x0000, CanonicalizeUnique },
+};
+
+} } // JSC::Yarr
+
</ins></span></pre></div>
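<p>Reading aid for the table above: each row is { begin, end, value, type }, and for the four pairing types the case partner of a code point follows from simple arithmetic. The block below is a minimal sketch of that arithmetic, not the WebKit implementation. <code>Row</code>, <code>RangeType</code> and <code>canonicalPair</code> are hypothetical names introduced only for illustration; the formulas mirror the <code>getCanonicalPair</code> switch in YarrCanonicalizeUnicode.h from this changeset.</p>
<pre>
```cpp
// Hypothetical stand-ins for UCS2CanonicalizationType and CanonicalizationRange.
enum RangeType { Unique, Set, RangeLo, RangeHi, AlternatingAligned, AlternatingUnaligned };

struct Row {
    int begin;
    int end;
    int value;
    RangeType type;
};

// Returns the single case partner of ch for a pairing row.
// Unique and Set rows have no single partner, so this sketch returns ch unchanged
// (an assumption of the sketch; the real header handles sets separately).
int canonicalPair(Row row, int ch)
{
    switch (row.type) {
    case RangeLo:
        return ch + row.value;        // lower half of a pair: partner is value above
    case RangeHi:
        return ch - row.value;        // upper half of a pair: partner is value below
    case AlternatingAligned:
        return ch ^ 1;                // pairs start on an even code point
    case AlternatingUnaligned:
        return ((ch - 1) ^ 1) + 1;    // pairs start on an odd code point
    default:
        return ch;
    }
}
```
</pre>
<p>For example, the row { 0x0410, 0x042f, 0x0020, CanonicalizeRangeLo } pairs 0x0410 with 0x0430, and the row { 0x0139, 0x0148, 0x0000, CanonicalizeAlternatingUnaligned } pairs 0x0139 with 0x013a.</p>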
<a id="trunkSourceJavaScriptCoreyarrYarrCanonicalizeUnicodehfromrev197165trunkSourceJavaScriptCoreyarrYarrCanonicalizeUCS2h"></a>
<div class="copfile"><h4>Copied: trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.h (from rev 197165, trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.h) (0 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.h                                (rev 0)
+++ trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.h        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -0,0 +1,144 @@
</span><ins>+/*
+ * Copyright (C) 2012-2016 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+
+#ifndef YarrCanonicalizeUnicode_h
+#define YarrCanonicalizeUnicode_h
+
+#include &lt;stdint.h&gt;
+#include &lt;unicode/utypes.h&gt;
+
+namespace JSC { namespace Yarr {
+
+// This set of data (autogenerated using YarrCanonicalizeUnicode.js into YarrCanonicalizeUnicode.cpp)
+// provides information for each UCS2 code point as to the set of code points that it should
+// match under the ES5.1 case insensitive RegExp matching rules, specified in 15.10.2.8.
+enum UCS2CanonicalizationType {
+    CanonicalizeUnique,               // No canonically equal values, e.g. 0x0.
+    CanonicalizeSet,                  // Value is an index of a set in characterSetInfo.
+    CanonicalizeRangeLo,              // Value is a positive delta up to the pair, e.g. 0x41 has value 0x20, -&gt; 0x61.
+    CanonicalizeRangeHi,              // Value is a positive delta down to the pair, e.g. 0x61 has value 0x20, -&gt; 0x41.
+    CanonicalizeAlternatingAligned,   // Aligned consecutive pair, e.g. 0x1f4,0x1f5.
+    CanonicalizeAlternatingUnaligned, // Unaligned consecutive pair, e.g. 0x241,0x242.
+};
+struct CanonicalizationRange {
+    UChar32 begin;
+    UChar32 end;
+    UChar32 value;
+    UCS2CanonicalizationType type;
+};
+
+extern const size_t UCS2_CANONICALIZATION_RANGES;
+extern const UChar32* const ucs2CharacterSetInfo[];
+extern const CanonicalizationRange ucs2RangeInfo[];
+
+extern const size_t UNICODE_CANONICALIZATION_RANGES;
+extern const UChar32* const unicodeCharacterSetInfo[];
+extern const CanonicalizationRange unicodeRangeInfo[];
+
+enum class CanonicalMode { UCS2, Unicode };
+
+inline const UChar32* canonicalCharacterSetInfo(unsigned index, CanonicalMode canonicalMode)
+{
+    const UChar32* const* rangeInfo = canonicalMode == CanonicalMode::UCS2 ? ucs2CharacterSetInfo : unicodeCharacterSetInfo;
+    return rangeInfo[index];
+}
+
+// This searches in log2 time over ~400-600 entries, so should typically result in 9 compares.
+inline const CanonicalizationRange* canonicalRangeInfoFor(UChar32 ch, CanonicalMode canonicalMode = CanonicalMode::UCS2)
+{
+    const CanonicalizationRange* info = canonicalMode == CanonicalMode::UCS2 ? ucs2RangeInfo : unicodeRangeInfo;
+    size_t entries = canonicalMode == CanonicalMode::UCS2 ? UCS2_CANONICALIZATION_RANGES : UNICODE_CANONICALIZATION_RANGES;
+
+    while (true) {
+        size_t candidate = entries &gt;&gt; 1;
+        const CanonicalizationRange* candidateInfo = info + candidate;
+        if (ch &lt; candidateInfo-&gt;begin)
+            entries = candidate;
+        else if (ch &lt;= candidateInfo-&gt;end)
+            return candidateInfo;
+        else {
+            info = candidateInfo + 1;
+            entries -= (candidate + 1);
+        }
+    }
+}
+
+// Should only be called for characters that have one canonically matching value.
+inline UChar32 getCanonicalPair(const CanonicalizationRange* info, UChar32 ch)
+{
+    ASSERT(ch &gt;= info-&gt;begin &amp;&amp; ch &lt;= info-&gt;end);
+    switch (info-&gt;type) {
+    case CanonicalizeRangeLo:
+        return ch + info-&gt;value;
+    case CanonicalizeRangeHi:
+        return ch - info-&gt;value;
+    case CanonicalizeAlternatingAligned:
+        return ch ^ 1;
+    case CanonicalizeAlternatingUnaligned:
+        return ((ch - 1) ^ 1) + 1;
+    default:
+        RELEASE_ASSERT_NOT_REACHED();
+    }
+    RELEASE_ASSERT_NOT_REACHED();
+    return 0;
+}
+
+// Returns true if no other code point can match this value under the given canonical mode.
+inline bool isCanonicallyUnique(UChar32 ch, CanonicalMode canonicalMode = CanonicalMode::UCS2)
+{
+    return canonicalRangeInfoFor(ch, canonicalMode)-&gt;type == CanonicalizeUnique;
+}
+
+// Returns true if values are equal, under the canonicalization rules.
+inline bool areCanonicallyEquivalent(UChar32 a, UChar32 b, CanonicalMode canonicalMode = CanonicalMode::UCS2)
+{
+    const CanonicalizationRange* info = canonicalRangeInfoFor(a, canonicalMode);
+    switch (info-&gt;type) {
+    case CanonicalizeUnique:
+        return a == b;
+    case CanonicalizeSet: {
+        for (const UChar32* set = canonicalCharacterSetInfo(info-&gt;value, canonicalMode); (a = *set); ++set) {
+            if (a == b)
+                return true;
+        }
+        return false;
+    }
+    case CanonicalizeRangeLo:
+        return (a == b) || (a + info-&gt;value == b);
+    case CanonicalizeRangeHi:
+        return (a == b) || (a - info-&gt;value == b);
+    case CanonicalizeAlternatingAligned:
+        return (a | 1) == (b | 1);
+    case CanonicalizeAlternatingUnaligned:
+        return ((a - 1) | 1) == ((b - 1) | 1);
+    }
+
+    RELEASE_ASSERT_NOT_REACHED();
+    return false;
+}
+
+} } // JSC::Yarr
+
+#endif
</ins></span></pre></div>
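<p>The pair arithmetic in getCanonicalPair above can be checked against the examples given in the enum comments. The following is a minimal sketch in JavaScript, not the code from this change: the function name <code>canonicalPair</code> and the string type names are assumptions made for illustration.</p>

```javascript
// Illustrative restatement of getCanonicalPair; `canonicalPair` and the
// string type names are hypothetical and exist only in this sketch.
function canonicalPair(type, value, ch) {
    switch (type) {
    case "RangeLo":
        return ch + value;          // low partner: the pair lies `value` above
    case "RangeHi":
        return ch - value;          // high partner: the pair lies `value` below
    case "AlternatingAligned":
        return ch ^ 1;              // pair starts on an even code point: flip the low bit
    case "AlternatingUnaligned":
        return ((ch - 1) ^ 1) + 1;  // pair starts on an odd code point: shift down, flip, shift back
    default:
        throw new Error("no single canonical pair for type " + type);
    }
}

// Examples taken from the enum comments in the header.
console.log(canonicalPair("RangeLo", 0x20, 0x41).toString(16));
console.log(canonicalPair("AlternatingUnaligned", 0, 0x241).toString(16));
```

<p>Applying the function twice returns the original code point for every type, which is why a single range entry can describe both members of each pair.</p>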
<a id="trunkSourceJavaScriptCoreyarrYarrCanonicalizeUnicodejsfromrev197165trunkSourceJavaScriptCoreyarrYarrCanonicalizeUCS2js"></a>
<div class="copfile"><h4>Copied: trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.js (from rev 197165, trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.js) (0 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.js                                (rev 0)
+++ trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.js        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -0,0 +1,221 @@
</span><ins>+/*
+ * Copyright (C) 2012, 2016 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */
+
+function printHeader()
+{
+    var copyright = (
+                     &quot;/*&quot;                                                                            + &quot;\n&quot; +
+                     &quot; * Copyright (C) 2012-2013, 2015-2016 Apple Inc. All rights reserved.&quot;         + &quot;\n&quot; +
+                     &quot; *&quot;                                                                            + &quot;\n&quot; +
+                     &quot; * Redistribution and use in source and binary forms, with or without&quot;         + &quot;\n&quot; +
+                     &quot; * modification, are permitted provided that the following conditions&quot;         + &quot;\n&quot; +
+                     &quot; * are met:&quot;                                                                   + &quot;\n&quot; +
+                     &quot; * 1. Redistributions of source code must retain the above copyright&quot;          + &quot;\n&quot; +
+                     &quot; *    notice, this list of conditions and the following disclaimer.&quot;           + &quot;\n&quot; +
+                     &quot; * 2. Redistributions in binary form must reproduce the above copyright&quot;       + &quot;\n&quot; +
+                     &quot; *    notice, this list of conditions and the following disclaimer in the&quot;     + &quot;\n&quot; +
+                     &quot; *    documentation and/or other materials provided with the distribution.&quot;    + &quot;\n&quot; +
+                     &quot; *&quot;                                                                            + &quot;\n&quot; +
+                     &quot; * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY&quot;                  + &quot;\n&quot; +
+                     &quot; * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE&quot;          + &quot;\n&quot; +
+                     &quot; * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR&quot;         + &quot;\n&quot; +
+                     &quot; * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL APPLE INC. OR&quot;                   + &quot;\n&quot; +
+                     &quot; * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,&quot;      + &quot;\n&quot; +
+                     &quot; * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,&quot;        + &quot;\n&quot; +
+                     &quot; * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR&quot;         + &quot;\n&quot; +
+                     &quot; * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY&quot;        + &quot;\n&quot; +
+                     &quot; * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT&quot;               + &quot;\n&quot; +
+                     &quot; * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE&quot;      + &quot;\n&quot; +
+                     &quot; * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. &quot;      + &quot;\n&quot; +
+                     &quot; */&quot;);
+    
+    print(copyright);
+    print();
+    print(&quot;// DO NOT EDIT! - this file autogenerated by YarrCanonicalizeUnicode.js&quot;);
+    print();
+    print('#include &quot;config.h&quot;');
+    print('#include &quot;YarrCanonicalizeUnicode.h&quot;');
+    print();
+    print(&quot;namespace JSC { namespace Yarr {&quot;);
+    print();
+    print(&quot;#include &lt;stdint.h&gt;&quot;);
+    print();
+}
+
+function printFooter()
+{
+    print(&quot;} } // JSC::Yarr&quot;);
+    print();
+}
+
+// Helper function to convert a number to a fixed width hex representation of a UChar32.
+function hex(x)
+{
+    var s = Number(x).toString(16);
+    while (s.length &lt; 4)
+        s = 0 + s;
+    return &quot;0x&quot; + s;
+}
+
+// See ES 6.0, 21.2.2.8.2 Step 3
+function canonicalize(ch)
+{
+    var u = String.fromCharCode(ch).toUpperCase();
+    if (u.length &gt; 1)
+        return ch;
+    var cu = u.charCodeAt(0);
+    if (ch &gt;= 128 &amp;&amp; cu &lt; 128)
+        return ch;
+    return cu;
+}
+
+// See ES 6.0, 21.2.2.8.2 Step 2
+function canonicalizeUnicode(ch)
+{
+    if (ch &lt; 128)
+        return canonicalize(ch);
+
+    return String.fromCodePoint(ch).toUpperCase().codePointAt(0);
+}
+
+var MAX_UCS2 = 0xFFFF;
+var MAX_UNICODE = 0x10FFFF;
+
+function createUCS2CanonicalGroups()
+{
+    var groupedCanonically = [];
+    // Pass 1: populate groupedCanonically - this is a mapping from canonicalized
+    // values back to the set of character codes that canonicalize to them.
+    for (var i = 0; i &lt;= MAX_UCS2; ++i) {
+        var ch = canonicalize(i);
+        if (!groupedCanonically[ch])
+            groupedCanonically[ch] = [];
+        groupedCanonically[ch].push(i);
+    }
+
+    return groupedCanonically;
+}
+
+function createUnicodeCanonicalGroups()
+{
+    var groupedCanonically = [];
+    // Pass 1: populate groupedCanonically - this is a mapping from canonicalized
+    // values back to the set of character codes that canonicalize to them.
+    for (var i = 0; i &lt;= MAX_UNICODE; ++i) {
+        var ch = canonicalizeUnicode(i);
+        if (!groupedCanonically[ch])
+            groupedCanonically[ch] = [];
+        groupedCanonically[ch].push(i);
+    }
+
+    return groupedCanonically;
+}
+
+function createTables(prefix, maxValue, canonicalGroups)
+{
+    var prefixLower = prefix.toLowerCase();
+    var prefixUpper = prefix.toUpperCase();
+    var typeInfo = [];
+    var characterSetInfo = [];
+    // Pass 2: populate typeInfo &amp; characterSetInfo. For every character calculate
+    // a typeInfo value, described by the types above, and a value payload.
+    for (cu in canonicalGroups) {
+        // The set of characters that canonicalize to cu
+        var characters = canonicalGroups[cu];
+
+        // If there is only one, it is unique.
+        if (characters.length == 1) {
+            typeInfo[characters[0]] = &quot;CanonicalizeUnique:0&quot;;
+            continue;
+        }
+
+        // Sort the array.
+        characters.sort(function(x,y){return x-y;});
+
+        // If there are more than two characters, create an entry in characterSetInfo.
+        if (characters.length &gt; 2) {
+            for (i in characters)
+                typeInfo[characters[i]] = &quot;CanonicalizeSet:&quot; + characterSetInfo.length;
+            characterSetInfo.push(characters);
+
+            continue;
+        }
+
+        // We have a pair, mark alternating ranges, otherwise track whether this is the low or high partner.
+        var lo = characters[0];
+        var hi = characters[1];
+        var delta = hi - lo;
+        if (delta == 1) {
+            var type = lo &amp; 1 ? &quot;CanonicalizeAlternatingUnaligned:0&quot; : &quot;CanonicalizeAlternatingAligned:0&quot;;
+            typeInfo[lo] = type;
+            typeInfo[hi] = type;
+        } else {
+            typeInfo[lo] = &quot;CanonicalizeRangeLo:&quot; + delta;
+            typeInfo[hi] = &quot;CanonicalizeRangeHi:&quot; + delta;
+        }
+    }
+
+    var rangeInfo = [];
+    // Pass 3: coalesce types into ranges.
+    for (var end = 0; end &lt;= maxValue; ++end) {
+        var begin = end;
+        var type = typeInfo[end];
+        while (end &lt; maxValue &amp;&amp; typeInfo[end + 1] == type)
+            ++end;
+        rangeInfo.push({begin:begin, end:end, type:type});
+    }
+
+    for (i in characterSetInfo) {
+        var characters = &quot;&quot;
+        var set = characterSetInfo[i];
+        for (var j in set)
+            characters += hex(set[j]) + &quot;, &quot;;
+        print(&quot;const UChar32 &quot; + prefixLower + &quot;CharacterSet&quot; + i + &quot;[] = { &quot; + characters + &quot;0 };&quot;);
+    }
+    print();
+    print(&quot;static const size_t &quot; + prefixUpper + &quot;_CANONICALIZATION_SETS = &quot; + characterSetInfo.length + &quot;;&quot;);
+    print(&quot;const UChar32* const &quot; + prefixLower + &quot;CharacterSetInfo[&quot; + prefixUpper + &quot;_CANONICALIZATION_SETS] = {&quot;);
+    for (i in characterSetInfo)
+        print(&quot;    &quot; + prefixLower + &quot;CharacterSet&quot; + i + &quot;,&quot;);
+    print(&quot;};&quot;);
+    print();
+    print(&quot;const size_t &quot; + prefixUpper + &quot;_CANONICALIZATION_RANGES = &quot; + rangeInfo.length + &quot;;&quot;);
+    print(&quot;const CanonicalizationRange &quot; + prefixLower + &quot;RangeInfo[&quot; + prefixUpper + &quot;_CANONICALIZATION_RANGES] = {&quot;);
+    for (i in rangeInfo) {
+        var info = rangeInfo[i];
+        var typeAndValue = info.type.split(':');
+        print(&quot;    { &quot; + hex(info.begin) + &quot;, &quot; + hex(info.end) + &quot;, &quot; + hex(typeAndValue[1]) + &quot;, &quot; + typeAndValue[0] + &quot; },&quot;);
+    }
+    print(&quot;};&quot;);
+    print();
+}
+
+printHeader();
+
+createTables(&quot;UCS2&quot;, MAX_UCS2, createUCS2CanonicalGroups());
+createTables(&quot;Unicode&quot;, MAX_UNICODE, createUnicodeCanonicalGroups());
+
+printFooter();
+
</ins></span></pre></div>
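<p>To see how the generator's first two passes fit together, the sketch below reuses the <code>canonicalize</code> logic from the script above on a small range and classifies one resulting pair. The helper names <code>groupCanonically</code> and <code>classifyPair</code> are assumptions made for illustration; the script itself performs these steps inline in createUCS2CanonicalGroups and createTables.</p>

```javascript
// Same rule as canonicalize() in the generator: upper-case the code unit,
// but keep the original if the result is longer than one code unit or if a
// non-ASCII character would map to an ASCII one.
function canonicalize(ch) {
    var u = String.fromCharCode(ch).toUpperCase();
    if (u.length > 1)
        return ch;
    var cu = u.charCodeAt(0);
    if (ch >= 128 && cu < 128)
        return ch;
    return cu;
}

// Hypothetical helper: pass 1 restricted to the range [0, maxValue].
function groupCanonically(maxValue) {
    var groups = [];
    for (var i = 0; i <= maxValue; ++i) {
        var ch = canonicalize(i);
        if (!groups[ch])
            groups[ch] = [];
        groups[ch].push(i);
    }
    return groups;
}

// Hypothetical helper: the pair branch of pass 2, returning the
// "Type:value" strings for the low and high partners.
function classifyPair(lo, hi) {
    var delta = hi - lo;
    if (delta == 1) {
        var type = lo & 1 ? "CanonicalizeAlternatingUnaligned:0" : "CanonicalizeAlternatingAligned:0";
        return [type, type];
    }
    return ["CanonicalizeRangeLo:" + delta, "CanonicalizeRangeHi:" + delta];
}

var groups = groupCanonically(0xFF);
console.log(groups[0x41]);               // the code units that canonicalize to 'A'
console.log(classifyPair(0x41, 0x61));
```

<p>The two early returns in <code>canonicalize</code> are what keep characters such as U+00DF (upper-cases to two code units) and U+0131 (upper-cases to ASCII 'I') in groups of their own.</p>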
<a id="trunkSourceJavaScriptCoreyarrYarrInterpretercpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/yarr/YarrInterpreter.cpp (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrInterpreter.cpp        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/yarr/YarrInterpreter.cpp        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -1,5 +1,5 @@
</span><span class="cx"> /*
</span><del>- * Copyright (C) 2009 Apple Inc. All rights reserved.
</del><ins>+ * Copyright (C) 2009, 2013, 2016 Apple Inc. All rights reserved.
</ins><span class="cx">  * Copyright (C) 2010 Peter Varga (pvarga@inf.u-szeged.hu), University of Szeged
</span><span class="cx">  *
</span><span class="cx">  * Redistribution and use in source and binary forms, with or without
</span><span class="lines">@@ -28,7 +28,7 @@
</span><span class="cx"> #include &quot;YarrInterpreter.h&quot;
</span><span class="cx"> 
</span><span class="cx"> #include &quot;Yarr.h&quot;
</span><del>-#include &quot;YarrCanonicalizeUCS2.h&quot;
</del><ins>+#include &quot;YarrCanonicalizeUnicode.h&quot;
</ins><span class="cx"> #include &lt;wtf/BumpPointerAllocator.h&gt;
</span><span class="cx"> #include &lt;wtf/DataLog.h&gt;
</span><span class="cx"> #include &lt;wtf/text/CString.h&gt;
</span><span class="lines">@@ -44,9 +44,11 @@
</span><span class="cx">     struct ParenthesesDisjunctionContext;
</span><span class="cx"> 
</span><span class="cx">     struct BackTrackInfoPatternCharacter {
</span><ins>+        uintptr_t begin; // Only needed for unicode patterns
</ins><span class="cx">         uintptr_t matchAmount;
</span><span class="cx">     };
</span><span class="cx">     struct BackTrackInfoCharacterClass {
</span><ins>+        uintptr_t begin; // Only needed for unicode patterns
</ins><span class="cx">         uintptr_t matchAmount;
</span><span class="cx">     };
</span><span class="cx">     struct BackTrackInfoBackReference {
</span><span class="lines">@@ -167,10 +169,11 @@
</span><span class="cx"> 
</span><span class="cx">     class InputStream {
</span><span class="cx">     public:
</span><del>-        InputStream(const CharType* input, unsigned start, unsigned length)
</del><ins>+        InputStream(const CharType* input, unsigned start, unsigned length, bool decodeSurrogatePairs)
</ins><span class="cx">             : input(input)
</span><span class="cx">             , pos(start)
</span><span class="cx">             , length(length)
</span><ins>+            , decodeSurrogatePairs(decodeSurrogatePairs)
</ins><span class="cx">         {
</span><span class="cx">         }
</span><span class="cx"> 
</span><span class="lines">@@ -204,13 +207,43 @@
</span><span class="cx">             RELEASE_ASSERT(pos &gt;= negativePositionOffest);
</span><span class="cx">             unsigned p = pos - negativePositionOffest;
</span><span class="cx">             ASSERT(p &lt; length);
</span><del>-            return input[p];
</del><ins>+            int result = input[p];
+            if (U16_IS_LEAD(result) &amp;&amp; decodeSurrogatePairs &amp;&amp; p + 1 &lt; length
+                &amp;&amp; U16_IS_TRAIL(input[p + 1])) {
+                if (atEnd())
+                    return -1;
+                
+                result = U16_GET_SUPPLEMENTARY(result, input[p + 1]);
+                next();
+            }
+            return result;
</ins><span class="cx">         }
</span><ins>+        
+        int readSurrogatePairChecked(unsigned negativePositionOffest)
+        {
+            RELEASE_ASSERT(pos &gt;= negativePositionOffest);
+            unsigned p = pos - negativePositionOffest;
+            ASSERT(p &lt; length);
+            if (p + 1 &gt;= length)
+                return -1;
</ins><span class="cx"> 
</span><ins>+            int first = input[p];
+            if (U16_IS_LEAD(first) &amp;&amp; U16_IS_TRAIL(input[p + 1]))
+                return U16_GET_SUPPLEMENTARY(first, input[p + 1]);
+
+            return -1;
+        }
+
</ins><span class="cx">         int reread(unsigned from)
</span><span class="cx">         {
</span><span class="cx">             ASSERT(from &lt; length);
</span><del>-            return input[from];
</del><ins>+            int result = input[from];
+            if (U16_IS_LEAD(result) &amp;&amp; decodeSurrogatePairs &amp;&amp; from + 1 &lt; length
+                &amp;&amp; U16_IS_TRAIL(input[from + 1])) {
+                
+                result = U16_GET_SUPPLEMENTARY(result, input[from + 1]);
+            }
+            return result;
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         int prev()
</span><span class="lines">@@ -281,11 +314,12 @@
</span><span class="cx">         const CharType* input;
</span><span class="cx">         unsigned pos;
</span><span class="cx">         unsigned length;
</span><ins>+        bool decodeSurrogatePairs;
</ins><span class="cx">     };
</span><span class="cx"> 
</span><span class="cx">     bool testCharacterClass(CharacterClass* characterClass, int ch)
</span><span class="cx">     {
</span><del>-        if (ch &amp; 0xFF80) {
</del><ins>+        if (ch &amp; 0x1FFF80) {
</ins><span class="cx">             for (unsigned i = 0; i &lt; characterClass-&gt;m_matchesUnicode.size(); ++i)
</span><span class="cx">                 if (ch == characterClass-&gt;m_matchesUnicode[i])
</span><span class="cx">                     return true;
</span><span class="lines">@@ -309,6 +343,11 @@
</span><span class="cx">         return testChar == input.readChecked(negativeInputOffset);
</span><span class="cx">     }
</span><span class="cx"> 
</span><ins>+    bool checkSurrogatePair(int testUnicodeChar, unsigned negativeInputOffset)
+    {
+        return testUnicodeChar == input.readSurrogatePairChecked(negativeInputOffset);
+    }
+
</ins><span class="cx">     bool checkCasedCharacter(int loChar, int hiChar, unsigned negativeInputOffset)
</span><span class="cx">     {
</span><span class="cx">         int ch = input.readChecked(negativeInputOffset);
</span><span class="lines">@@ -328,32 +367,30 @@
</span><span class="cx">         if (!input.checkInput(matchSize))
</span><span class="cx">             return false;
</span><span class="cx"> 
</span><del>-        if (pattern-&gt;m_ignoreCase) {
-            for (unsigned i = 0; i &lt; matchSize; ++i) {
-                int oldCh = input.reread(matchBegin + i);
-                int ch = input.readChecked(negativeInputOffset + matchSize - i);
</del><ins>+        for (unsigned i = 0; i &lt; matchSize; ++i) {
+            int oldCh = input.reread(matchBegin + i);
+            int ch;
+            if (!U_IS_BMP(oldCh)) {
+                ch = input.readSurrogatePairChecked(negativeInputOffset + matchSize - i);
+                ++i;
+            } else
+                ch = input.readChecked(negativeInputOffset + matchSize - i);
</ins><span class="cx"> 
</span><del>-                if (oldCh == ch)
-                    continue;
</del><ins>+            if (oldCh == ch)
+                continue;
</ins><span class="cx"> 
</span><del>-                // The definition for canonicalize (see ES 5.1, 15.10.2.8) means that
</del><ins>+            if (pattern-&gt;m_ignoreCase) {
+                // The definition for canonicalize (see ES 6.0, 21.2.2.8.2) means that
</ins><span class="cx">                 // unicode values are never allowed to match against ascii ones.
</span><span class="cx">                 if (isASCII(oldCh) || isASCII(ch)) {
</span><span class="cx">                     if (toASCIIUpper(oldCh) == toASCIIUpper(ch))
</span><span class="cx">                         continue;
</span><del>-                } else if (areCanonicallyEquivalent(oldCh, ch))
</del><ins>+                } else if (areCanonicallyEquivalent(oldCh, ch, unicode ? CanonicalMode::Unicode : CanonicalMode::UCS2))
</ins><span class="cx">                     continue;
</span><ins>+            }
</ins><span class="cx"> 
</span><del>-                input.uncheckInput(matchSize);
-                return false;
-            }
-        } else {
-            for (unsigned i = 0; i &lt; matchSize; ++i) {
-                if (!checkCharacter(input.reread(matchBegin + i), negativeInputOffset + matchSize - i)) {
-                    input.uncheckInput(matchSize);
-                    return false;
-                }
-            }
</del><ins>+            input.uncheckInput(matchSize);
+            return false;
</ins><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         return true;
</span><span class="lines">@@ -396,7 +433,10 @@
</span><span class="cx">         case QuantifierGreedy:
</span><span class="cx">             if (backTrack-&gt;matchAmount) {
</span><span class="cx">                 --backTrack-&gt;matchAmount;
</span><del>-                input.uncheckInput(1);
</del><ins>+                if (unicode &amp;&amp; !U_IS_BMP(term.atom.patternCharacter))
+                    input.uncheckInput(2);
+                else
+                    input.uncheckInput(1);
</ins><span class="cx">                 return true;
</span><span class="cx">             }
</span><span class="cx">             break;
</span><span class="lines">@@ -407,7 +447,7 @@
</span><span class="cx">                 if (checkCharacter(term.atom.patternCharacter, term.inputPosition + 1))
</span><span class="cx">                     return true;
</span><span class="cx">             }
</span><del>-            input.uncheckInput(backTrack-&gt;matchAmount);
</del><ins>+            input.setPos(backTrack-&gt;begin);
</ins><span class="cx">             break;
</span><span class="cx">         }
</span><span class="cx"> 
</span><span class="lines">@@ -446,10 +486,23 @@
</span><span class="cx">     bool matchCharacterClass(ByteTerm&amp; term, DisjunctionContext* context)
</span><span class="cx">     {
</span><span class="cx">         ASSERT(term.type == ByteTerm::TypeCharacterClass);
</span><del>-        BackTrackInfoPatternCharacter* backTrack = reinterpret_cast&lt;BackTrackInfoPatternCharacter*&gt;(context-&gt;frame + term.frameLocation);
</del><ins>+        BackTrackInfoCharacterClass* backTrack = reinterpret_cast&lt;BackTrackInfoCharacterClass*&gt;(context-&gt;frame + term.frameLocation);
</ins><span class="cx"> 
</span><span class="cx">         switch (term.atom.quantityType) {
</span><span class="cx">         case QuantifierFixedCount: {
</span><ins>+            if (unicode) {
+                backTrack-&gt;begin = input.getPos();
+                unsigned matchAmount = 0;
+                for (matchAmount = 0; matchAmount &lt; term.atom.quantityCount; ++matchAmount) {
+                    if (!checkCharacterClass(term.atom.characterClass, term.invert(), term.inputPosition - matchAmount)) {
+                        input.setPos(backTrack-&gt;begin);
+                        return false;
+                    }
+                }
+
+                return true;
+            }
+
</ins><span class="cx">             for (unsigned matchAmount = 0; matchAmount &lt; term.atom.quantityCount; ++matchAmount) {
</span><span class="cx">                 if (!checkCharacterClass(term.atom.characterClass, term.invert(), term.inputPosition - matchAmount))
</span><span class="cx">                     return false;
</span><span class="lines">@@ -458,6 +511,7 @@
</span><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         case QuantifierGreedy: {
</span><ins>+            backTrack-&gt;begin = input.getPos();
</ins><span class="cx">             unsigned matchAmount = 0;
</span><span class="cx">             while ((matchAmount &lt; term.atom.quantityCount) &amp;&amp; input.checkInput(1)) {
</span><span class="cx">                 if (!checkCharacterClass(term.atom.characterClass, term.invert(), term.inputPosition + 1)) {
</span><span class="lines">@@ -472,6 +526,7 @@
</span><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         case QuantifierNonGreedy:
</span><ins>+            backTrack-&gt;begin = input.getPos();
</ins><span class="cx">             backTrack-&gt;matchAmount = 0;
</span><span class="cx">             return true;
</span><span class="cx">         }
</span><span class="lines">@@ -483,14 +538,28 @@
</span><span class="cx">     bool backtrackCharacterClass(ByteTerm&amp; term, DisjunctionContext* context)
</span><span class="cx">     {
</span><span class="cx">         ASSERT(term.type == ByteTerm::TypeCharacterClass);
</span><del>-        BackTrackInfoPatternCharacter* backTrack = reinterpret_cast&lt;BackTrackInfoPatternCharacter*&gt;(context-&gt;frame + term.frameLocation);
</del><ins>+        BackTrackInfoCharacterClass* backTrack = reinterpret_cast&lt;BackTrackInfoCharacterClass*&gt;(context-&gt;frame + term.frameLocation);
</ins><span class="cx"> 
</span><span class="cx">         switch (term.atom.quantityType) {
</span><span class="cx">         case QuantifierFixedCount:
</span><ins>+            if (unicode)
+                input.setPos(backTrack-&gt;begin);
</ins><span class="cx">             break;
</span><span class="cx"> 
</span><span class="cx">         case QuantifierGreedy:
</span><span class="cx">             if (backTrack-&gt;matchAmount) {
</span><ins>+                if (unicode) {
+                    // Rematch one less match
+                    input.setPos(backTrack-&gt;begin);
+                    --backTrack-&gt;matchAmount;
+                    for (unsigned matchAmount = 0; (matchAmount &lt; backTrack-&gt;matchAmount) &amp;&amp; input.checkInput(1); ++matchAmount) {
+                        if (!checkCharacterClass(term.atom.characterClass, term.invert(), term.inputPosition + 1)) {
+                            input.uncheckInput(1);
+                            break;
+                        }
+                    }
+                    return true;
+                }
</ins><span class="cx">                 --backTrack-&gt;matchAmount;
</span><span class="cx">                 input.uncheckInput(1);
</span><span class="cx">                 return true;
</span><span class="lines">@@ -503,7 +572,7 @@
</span><span class="cx">                 if (checkCharacterClass(term.atom.characterClass, term.invert(), term.inputPosition + 1))
</span><span class="cx">                     return true;
</span><span class="cx">             }
</span><del>-            input.uncheckInput(backTrack-&gt;matchAmount);
</del><ins>+            input.setPos(backTrack-&gt;begin);
</ins><span class="cx">             break;
</span><span class="cx">         }
</span><span class="cx"> 
</span><span class="lines">@@ -773,7 +842,7 @@
</span><span class="cx">         if (backTrack-&gt;begin == input.getPos())
</span><span class="cx">             return false;
</span><span class="cx"> 
</span><del>-        // Successful match! Okay, what's next? - loop around and try to match moar!
</del><ins>+        // Successful match! Okay, what's next? - loop around and try to match more!
</ins><span class="cx">         context-&gt;term -= (term.atom.parenthesesWidth + 1);
</span><span class="cx">         return true;
</span><span class="cx">     }
</span><span class="lines">@@ -1154,9 +1223,23 @@
</span><span class="cx"> 
</span><span class="cx">         case ByteTerm::TypePatternCharacterOnce:
</span><span class="cx">         case ByteTerm::TypePatternCharacterFixed: {
</span><ins>+            if (unicode) {
+                if (!U_IS_BMP(currentTerm().atom.patternCharacter)) {
+                    for (unsigned matchAmount = 0; matchAmount &lt; currentTerm().atom.quantityCount; ++matchAmount) {
+                        if (!checkSurrogatePair(currentTerm().atom.patternCharacter, currentTerm().inputPosition - matchAmount)) {
+                            BACKTRACK();
+                        }
+                    }
+                    MATCH_NEXT();
+                }
+            }
+            unsigned position = input.getPos(); // May need to back out reading a surrogate pair.
+
</ins><span class="cx">             for (unsigned matchAmount = 0; matchAmount &lt; currentTerm().atom.quantityCount; ++matchAmount) {
</span><del>-                if (!checkCharacter(currentTerm().atom.patternCharacter, currentTerm().inputPosition - matchAmount))
</del><ins>+                if (!checkCharacter(currentTerm().atom.patternCharacter, currentTerm().inputPosition - matchAmount)) {
+                    input.setPos(position);
</ins><span class="cx">                     BACKTRACK();
</span><ins>+                }
</ins><span class="cx">             }
</span><span class="cx">             MATCH_NEXT();
</span><span class="cx">         }
</span><span class="lines">@@ -1176,12 +1259,28 @@
</span><span class="cx">         }
</span><span class="cx">         case ByteTerm::TypePatternCharacterNonGreedy: {
</span><span class="cx">             BackTrackInfoPatternCharacter* backTrack = reinterpret_cast&lt;BackTrackInfoPatternCharacter*&gt;(context-&gt;frame + currentTerm().frameLocation);
</span><ins>+            backTrack-&gt;begin = input.getPos();
</ins><span class="cx">             backTrack-&gt;matchAmount = 0;
</span><span class="cx">             MATCH_NEXT();
</span><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         case ByteTerm::TypePatternCasedCharacterOnce:
</span><span class="cx">         case ByteTerm::TypePatternCasedCharacterFixed: {
</span><ins>+            if (unicode) {
+                // Case insensitive matching of unicode characters is handled as TypeCharacterClass
+                ASSERT(U_IS_BMP(currentTerm().atom.patternCharacter));
+
+                unsigned position = input.getPos(); // May need to back out reading a surrogate pair.
+                
+                for (unsigned matchAmount = 0; matchAmount &lt; currentTerm().atom.quantityCount; ++matchAmount) {
+                    if (!checkCasedCharacter(currentTerm().atom.casedCharacter.lo, currentTerm().atom.casedCharacter.hi, currentTerm().inputPosition - matchAmount)) {
+                        input.setPos(position);
+                        BACKTRACK();
+                    }
+                }
+                MATCH_NEXT();
+            }
+
</ins><span class="cx">             for (unsigned matchAmount = 0; matchAmount &lt; currentTerm().atom.quantityCount; ++matchAmount) {
</span><span class="cx">                 if (!checkCasedCharacter(currentTerm().atom.casedCharacter.lo, currentTerm().atom.casedCharacter.hi, currentTerm().inputPosition - matchAmount))
</span><span class="cx">                     BACKTRACK();
</span><span class="lines">@@ -1190,6 +1289,10 @@
</span><span class="cx">         }
</span><span class="cx">         case ByteTerm::TypePatternCasedCharacterGreedy: {
</span><span class="cx">             BackTrackInfoPatternCharacter* backTrack = reinterpret_cast&lt;BackTrackInfoPatternCharacter*&gt;(context-&gt;frame + currentTerm().frameLocation);
</span><ins>+
+            // Case insensitive matching of unicode characters is handled as TypeCharacterClass
+            ASSERT(!unicode || U_IS_BMP(currentTerm().atom.patternCharacter));
+
</ins><span class="cx">             unsigned matchAmount = 0;
</span><span class="cx">             while ((matchAmount &lt; currentTerm().atom.quantityCount) &amp;&amp; input.checkInput(1)) {
</span><span class="cx">                 if (!checkCasedCharacter(currentTerm().atom.casedCharacter.lo, currentTerm().atom.casedCharacter.hi, currentTerm().inputPosition + 1)) {
</span><span class="lines">@@ -1204,6 +1307,10 @@
</span><span class="cx">         }
</span><span class="cx">         case ByteTerm::TypePatternCasedCharacterNonGreedy: {
</span><span class="cx">             BackTrackInfoPatternCharacter* backTrack = reinterpret_cast&lt;BackTrackInfoPatternCharacter*&gt;(context-&gt;frame + currentTerm().frameLocation);
</span><ins>+
+            // Case insensitive matching of unicode characters is handled as TypeCharacterClass
+            ASSERT(!unicode || U_IS_BMP(currentTerm().atom.patternCharacter));
+            
</ins><span class="cx">             backTrack-&gt;matchAmount = 0;
</span><span class="cx">             MATCH_NEXT();
</span><span class="cx">         }
</span><span class="lines">@@ -1439,8 +1546,9 @@
</span><span class="cx"> 
</span><span class="cx">     Interpreter(BytecodePattern* pattern, unsigned* output, const CharType* input, unsigned length, unsigned start)
</span><span class="cx">         : pattern(pattern)
</span><ins>+        , unicode(pattern-&gt;m_unicode)
</ins><span class="cx">         , output(output)
</span><del>-        , input(input, start, length)
</del><ins>+        , input(input, start, length, pattern-&gt;m_unicode)
</ins><span class="cx">         , allocatorPool(0)
</span><span class="cx">         , remainingMatchCount(matchLimit)
</span><span class="cx">     {
</span><span class="lines">@@ -1448,6 +1556,7 @@
</span><span class="cx"> 
</span><span class="cx"> private:
</span><span class="cx">     BytecodePattern* pattern;
</span><ins>+    bool unicode;
</ins><span class="cx">     unsigned* output;
</span><span class="cx">     InputStream input;
</span><span class="cx">     BumpPointerPool* allocatorPool;
</span><span class="lines">@@ -1506,14 +1615,14 @@
</span><span class="cx">         m_bodyDisjunction-&gt;terms.append(ByteTerm::WordBoundary(invert, inputPosition));
</span><span class="cx">     }
</span><span class="cx"> 
</span><del>-    void atomPatternCharacter(UChar ch, unsigned inputPosition, unsigned frameLocation, Checked&lt;unsigned&gt; quantityCount, QuantifierType quantityType)
</del><ins>+    void atomPatternCharacter(UChar32 ch, unsigned inputPosition, unsigned frameLocation, Checked&lt;unsigned&gt; quantityCount, QuantifierType quantityType)
</ins><span class="cx">     {
</span><span class="cx">         if (m_pattern.m_ignoreCase) {
</span><del>-            ASSERT(u_tolower(ch) &lt;= 0xFFFF);
-            ASSERT(u_toupper(ch) &lt;= 0xFFFF);
</del><ins>+            ASSERT(u_tolower(ch) &lt;= UCHAR_MAX_VALUE);
+            ASSERT(u_toupper(ch) &lt;= UCHAR_MAX_VALUE);
</ins><span class="cx"> 
</span><del>-            UChar lo = u_tolower(ch);
-            UChar hi = u_toupper(ch);
</del><ins>+            UChar32 lo = u_tolower(ch);
+            UChar32 hi = u_toupper(ch);
</ins><span class="cx"> 
</span><span class="cx">             if (lo != hi) {
</span><span class="cx">                 m_bodyDisjunction-&gt;terms.append(ByteTerm(lo, hi, inputPosition, frameLocation, quantityCount, quantityType));
</span></span></pre></div>
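<p>The fixed-count cases in the YarrInterpreter.cpp diff above save the input position before matching, so that a partially consumed surrogate pair can be backed out on failure. The following is a minimal standalone sketch of that idea, not the WebKit implementation: <code>Utf16Input</code>, <code>readCodePoint</code> and <code>matchFixed</code> are hypothetical names introduced only for illustration, and the sketch assumes the subject string is UTF-16.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the interpreter's input stream (not the WebKit class).
struct Utf16Input {
    std::u16string data;
    std::size_t pos = 0;

    static bool isLead(char16_t c) { return (c & 0xFC00) == 0xD800; }
    static bool isTrail(char16_t c) { return (c & 0xFC00) == 0xDC00; }

    // Reads one code point at pos and advances by one or two code units.
    // Returns -1 at end of input. An unpaired surrogate is returned unchanged.
    std::int32_t readCodePoint()
    {
        if (pos >= data.size())
            return -1;
        char16_t first = data[pos++];
        if (isLead(first) && pos < data.size() && isTrail(data[pos])) {
            char16_t second = data[pos++];
            return 0x10000 + ((static_cast<std::int32_t>(first) - 0xD800) << 10) + (second - 0xDC00);
        }
        return first;
    }
};

// Matches `count` consecutive copies of `expected`.
// On any mismatch the saved position is restored, so no half-read pair is left behind.
bool matchFixed(Utf16Input& input, std::int32_t expected, unsigned count)
{
    assert(expected >= 0 && expected <= 0x10FFFF);
    std::size_t saved = input.pos;
    for (unsigned i = 0; i < count; ++i) {
        if (input.readCodePoint() != expected) {
            input.pos = saved;
            return false;
        }
    }
    return true;
}
```

<p>Restoring the position in one place keeps the failure path simple: the caller can backtrack without knowing how many code units each attempted character consumed.</p>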
<a id="trunkSourceJavaScriptCoreyarrYarrInterpreterh"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/yarr/YarrInterpreter.h (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrInterpreter.h        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/yarr/YarrInterpreter.h        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -1,5 +1,5 @@
</span><span class="cx"> /*
</span><del>- * Copyright (C) 2009, 2010 Apple Inc. All rights reserved.
</del><ins>+ * Copyright (C) 2009, 2010-2012, 2014, 2016 Apple Inc. All rights reserved.
</ins><span class="cx">  *
</span><span class="cx">  * Redistribution and use in source and binary forms, with or without
</span><span class="cx">  * modification, are permitted provided that the following conditions
</span><span class="lines">@@ -74,10 +74,10 @@
</span><span class="cx">     union {
</span><span class="cx">         struct {
</span><span class="cx">             union {
</span><del>-                UChar patternCharacter;
</del><ins>+                UChar32 patternCharacter;
</ins><span class="cx">                 struct {
</span><del>-                    UChar lo;
-                    UChar hi;
</del><ins>+                    UChar32 lo;
+                    UChar32 hi;
</ins><span class="cx">                 } casedCharacter;
</span><span class="cx">                 CharacterClass* characterClass;
</span><span class="cx">                 unsigned subpatternId;
</span><span class="lines">@@ -105,7 +105,7 @@
</span><span class="cx">     bool m_invert : 1;
</span><span class="cx">     unsigned inputPosition;
</span><span class="cx"> 
</span><del>-    ByteTerm(UChar ch, int inputPos, unsigned frameLocation, Checked&lt;unsigned&gt; quantityCount, QuantifierType quantityType)
</del><ins>+    ByteTerm(UChar32 ch, int inputPos, unsigned frameLocation, Checked&lt;unsigned&gt; quantityCount, QuantifierType quantityType)
</ins><span class="cx">         : frameLocation(frameLocation)
</span><span class="cx">         , m_capture(false)
</span><span class="cx">         , m_invert(false)
</span><span class="lines">@@ -128,7 +128,7 @@
</span><span class="cx">         inputPosition = inputPos;
</span><span class="cx">     }
</span><span class="cx"> 
</span><del>-    ByteTerm(UChar lo, UChar hi, int inputPos, unsigned frameLocation, Checked&lt;unsigned&gt; quantityCount, QuantifierType quantityType)
</del><ins>+    ByteTerm(UChar32 lo, UChar32 hi, int inputPos, unsigned frameLocation, Checked&lt;unsigned&gt; quantityCount, QuantifierType quantityType)
</ins><span class="cx">         : frameLocation(frameLocation)
</span><span class="cx">         , m_capture(false)
</span><span class="cx">         , m_invert(false)
</span><span class="lines">@@ -341,6 +341,7 @@
</span><span class="cx">         : m_body(WTFMove(body))
</span><span class="cx">         , m_ignoreCase(pattern.m_ignoreCase)
</span><span class="cx">         , m_multiline(pattern.m_multiline)
</span><ins>+        , m_unicode(pattern.m_unicode)
</ins><span class="cx">         , m_allocator(allocator)
</span><span class="cx">     {
</span><span class="cx">         m_body-&gt;terms.shrinkToFit();
</span><span class="lines">@@ -360,6 +361,7 @@
</span><span class="cx">     std::unique_ptr&lt;ByteDisjunction&gt; m_body;
</span><span class="cx">     bool m_ignoreCase;
</span><span class="cx">     bool m_multiline;
</span><ins>+    bool m_unicode;
</ins><span class="cx">     // Each BytecodePattern is associated with a RegExp, each RegExp is associated
</span><span class="cx">     // with a VM.  Cache a pointer to our VM's m_regExpAllocator.
</span><span class="cx">     BumpPointerAllocator* m_allocator;
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreyarrYarrJITcpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/yarr/YarrJIT.cpp (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrJIT.cpp        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/yarr/YarrJIT.cpp        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -1,5 +1,5 @@
</span><span class="cx"> /*
</span><del>- * Copyright (C) 2009, 2013 Apple Inc. All rights reserved.
</del><ins>+ * Copyright (C) 2009, 2013, 2015-2016 Apple Inc. All rights reserved.
</ins><span class="cx">  *
</span><span class="cx">  * Redistribution and use in source and binary forms, with or without
</span><span class="cx">  * modification, are permitted provided that the following conditions
</span><span class="lines">@@ -30,7 +30,7 @@
</span><span class="cx"> #include &quot;LinkBuffer.h&quot;
</span><span class="cx"> #include &quot;Options.h&quot;
</span><span class="cx"> #include &quot;Yarr.h&quot;
</span><del>-#include &quot;YarrCanonicalizeUCS2.h&quot;
</del><ins>+#include &quot;YarrCanonicalizeUnicode.h&quot;
</ins><span class="cx"> 
</span><span class="cx"> #if ENABLE(YARR_JIT)
</span><span class="cx"> 
</span><span class="lines">@@ -140,7 +140,7 @@
</span><span class="cx">         }
</span><span class="cx">     }
</span><span class="cx"> 
</span><del>-    void matchCharacterClassRange(RegisterID character, JumpList&amp; failures, JumpList&amp; matchDest, const CharacterRange* ranges, unsigned count, unsigned* matchIndex, const UChar* matches, unsigned matchCount)
</del><ins>+    void matchCharacterClassRange(RegisterID character, JumpList&amp; failures, JumpList&amp; matchDest, const CharacterRange* ranges, unsigned count, unsigned* matchIndex, const UChar32* matches, unsigned matchCount)
</ins><span class="cx">     {
</span><span class="cx">         do {
</span><span class="cx">             // pick which range we're going to generate
</span><span class="lines">@@ -200,15 +200,15 @@
</span><span class="cx"> 
</span><span class="cx">             if (charClass-&gt;m_matchesUnicode.size()) {
</span><span class="cx">                 for (unsigned i = 0; i &lt; charClass-&gt;m_matchesUnicode.size(); ++i) {
</span><del>-                    UChar ch = charClass-&gt;m_matchesUnicode[i];
</del><ins>+                    UChar32 ch = charClass-&gt;m_matchesUnicode[i];
</ins><span class="cx">                     matchDest.append(branch32(Equal, character, Imm32(ch)));
</span><span class="cx">                 }
</span><span class="cx">             }
</span><span class="cx"> 
</span><span class="cx">             if (charClass-&gt;m_rangesUnicode.size()) {
</span><span class="cx">                 for (unsigned i = 0; i &lt; charClass-&gt;m_rangesUnicode.size(); ++i) {
</span><del>-                    UChar lo = charClass-&gt;m_rangesUnicode[i].begin;
-                    UChar hi = charClass-&gt;m_rangesUnicode[i].end;
</del><ins>+                    UChar32 lo = charClass-&gt;m_rangesUnicode[i].begin;
+                    UChar32 hi = charClass-&gt;m_rangesUnicode[i].end;
</ins><span class="cx"> 
</span><span class="cx">                     Jump below = branch32(LessThan, character, Imm32(lo));
</span><span class="cx">                     matchDest.append(branch32(LessThanOrEqual, character, Imm32(hi)));
</span><span class="lines">@@ -285,7 +285,7 @@
</span><span class="cx">         return branch32(NotEqual, index, length);
</span><span class="cx">     }
</span><span class="cx"> 
</span><del>-    Jump jumpIfCharNotEquals(UChar ch, int inputPosition, RegisterID character)
</del><ins>+    Jump jumpIfCharNotEquals(UChar32 ch, int inputPosition, RegisterID character)
</ins><span class="cx">     {
</span><span class="cx">         readCharacter(inputPosition, character);
</span><span class="cx"> 
</span><span class="lines">@@ -766,7 +766,7 @@
</span><span class="cx">         YarrOp* nextOp = &amp;m_ops[opIndex + 1];
</span><span class="cx"> 
</span><span class="cx">         PatternTerm* term = op.m_term;
</span><del>-        UChar ch = term-&gt;patternCharacter;
</del><ins>+        UChar32 ch = term-&gt;patternCharacter;
</ins><span class="cx"> 
</span><span class="cx">         if ((ch &gt; 0xff) &amp;&amp; (m_charSize == Char8)) {
</span><span class="cx">             // Have a 16 bit pattern character and an 8 bit string - short circuit
</span><span class="lines">@@ -813,7 +813,7 @@
</span><span class="cx">             int shiftAmount = (m_charSize == Char8 ? 8 : 16) * numberCharacters;
</span><span class="cx"> #endif
</span><span class="cx"> 
</span><del>-            UChar currentCharacter = nextTerm-&gt;patternCharacter;
</del><ins>+            UChar32 currentCharacter = nextTerm-&gt;patternCharacter;
</ins><span class="cx"> 
</span><span class="cx">             if ((currentCharacter &gt; 0xff) &amp;&amp; (m_charSize == Char8)) {
</span><span class="cx">                 // Have a 16 bit pattern character and an 8 bit string - short circuit
</span><span class="lines">@@ -882,7 +882,7 @@
</span><span class="cx">     {
</span><span class="cx">         YarrOp&amp; op = m_ops[opIndex];
</span><span class="cx">         PatternTerm* term = op.m_term;
</span><del>-        UChar ch = term-&gt;patternCharacter;
</del><ins>+        UChar32 ch = term-&gt;patternCharacter;
</ins><span class="cx"> 
</span><span class="cx">         const RegisterID character = regT0;
</span><span class="cx">         const RegisterID countRegister = regT1;
</span><span class="lines">@@ -919,7 +919,7 @@
</span><span class="cx">     {
</span><span class="cx">         YarrOp&amp; op = m_ops[opIndex];
</span><span class="cx">         PatternTerm* term = op.m_term;
</span><del>-        UChar ch = term-&gt;patternCharacter;
</del><ins>+        UChar32 ch = term-&gt;patternCharacter;
</ins><span class="cx"> 
</span><span class="cx">         const RegisterID character = regT0;
</span><span class="cx">         const RegisterID countRegister = regT1;
</span><span class="lines">@@ -977,7 +977,7 @@
</span><span class="cx">     {
</span><span class="cx">         YarrOp&amp; op = m_ops[opIndex];
</span><span class="cx">         PatternTerm* term = op.m_term;
</span><del>-        UChar ch = term-&gt;patternCharacter;
</del><ins>+        UChar32 ch = term-&gt;patternCharacter;
</ins><span class="cx"> 
</span><span class="cx">         const RegisterID character = regT0;
</span><span class="cx">         const RegisterID countRegister = regT1;
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreyarrYarrParserh"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/yarr/YarrParser.h (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrParser.h        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/yarr/YarrParser.h        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -1,5 +1,5 @@
</span><span class="cx"> /*
</span><del>- * Copyright (C) 2009 Apple Inc. All rights reserved.
</del><ins>+ * Copyright (C) 2009, 2014-2016 Apple Inc. All rights reserved.
</ins><span class="cx">  *
</span><span class="cx">  * Redistribution and use in source and binary forms, with or without
</span><span class="cx">  * modification, are permitted provided that the following conditions
</span><span class="lines">@@ -46,7 +46,7 @@
</span><span class="cx"> class Parser {
</span><span class="cx"> private:
</span><span class="cx">     template&lt;class FriendDelegate&gt;
</span><del>-    friend const char* parse(FriendDelegate&amp;, const String&amp; pattern, unsigned backReferenceLimit);
</del><ins>+    friend const char* parse(FriendDelegate&amp;, const String&amp; pattern, bool isUnicode, unsigned backReferenceLimit);
</ins><span class="cx"> 
</span><span class="cx">     enum ErrorCode {
</span><span class="cx">         NoError,
</span><span class="lines">@@ -60,6 +60,7 @@
</span><span class="cx">         CharacterClassUnmatched,
</span><span class="cx">         CharacterClassOutOfOrder,
</span><span class="cx">         EscapeUnterminated,
</span><ins>+        InvalidUnicodeEscape,
</ins><span class="cx">         NumberOfErrorCodes
</span><span class="cx">     };
</span><span class="cx"> 
</span><span class="lines">@@ -101,7 +102,7 @@
</span><span class="cx">          * mode we will allow a hyphen to be treated as indicating a range (i.e. /[a-z]/
</span><span class="cx">          * is different to /[a\-z]/).
</span><span class="cx">          */
</span><del>-        void atomPatternCharacter(UChar ch, bool hyphenIsRange = false)
</del><ins>+        void atomPatternCharacter(UChar32 ch, bool hyphenIsRange = false)
</ins><span class="cx">         {
</span><span class="cx">             switch (m_state) {
</span><span class="cx">             case AfterCharacterClass:
</span><span class="lines">@@ -225,16 +226,17 @@
</span><span class="cx">             AfterCharacterClass,
</span><span class="cx">             AfterCharacterClassHyphen,
</span><span class="cx">         } m_state;
</span><del>-        UChar m_character;
</del><ins>+        UChar32 m_character;
</ins><span class="cx">     };
</span><span class="cx"> 
</span><del>-    Parser(Delegate&amp; delegate, const String&amp; pattern, unsigned backReferenceLimit)
</del><ins>+    Parser(Delegate&amp; delegate, const String&amp; pattern, bool isUnicode, unsigned backReferenceLimit)
</ins><span class="cx">         : m_delegate(delegate)
</span><span class="cx">         , m_backReferenceLimit(backReferenceLimit)
</span><span class="cx">         , m_err(NoError)
</span><span class="cx">         , m_data(pattern.characters&lt;CharType&gt;())
</span><span class="cx">         , m_size(pattern.length())
</span><span class="cx">         , m_index(0)
</span><ins>+        , m_isUnicode(isUnicode)
</ins><span class="cx">         , m_parenthesesNestingDepth(0)
</span><span class="cx">     {
</span><span class="cx">     }
</span><span class="lines">@@ -411,11 +413,55 @@
</span><span class="cx">         // UnicodeEscape
</span><span class="cx">         case 'u': {
</span><span class="cx">             consume();
</span><ins>+            if (atEndOfPattern()) {
+                delegate.atomPatternCharacter('u');
+                break;
+            }
+
+            if (peek() == '{') {
+                consume();
+                UChar32 codePoint = 0;
+                do {
+                    if (atEndOfPattern())
+                        m_err = InvalidUnicodeEscape;
+                    if (!WTF::isASCIIHexDigit(peek()))
+                        m_err = InvalidUnicodeEscape;
+
+                    codePoint = (codePoint &lt;&lt; 4) | WTF::toASCIIHexValue(consume());
+
+                    if (codePoint &gt; UCHAR_MAX_VALUE)
+                        m_err = InvalidUnicodeEscape;
+                } while (!atEndOfPattern() &amp;&amp; peek() != '}');
+                if (!atEndOfPattern())
+                    consume();
+                if (m_err)
+                    return false;
+
+                delegate.atomPatternCharacter(codePoint);
+                break;
+            }
</ins><span class="cx">             int u = tryConsumeHex(4);
</span><span class="cx">             if (u == -1)
</span><span class="cx">                 delegate.atomPatternCharacter('u');
</span><del>-            else
</del><ins>+            else {
+                // If we have the first of a surrogate pair, look for the second.
+                if (U16_IS_LEAD(u) &amp;&amp; m_isUnicode &amp;&amp; (patternRemaining() &gt;= 6) &amp;&amp; peek() == '\\') {
+                    ParseState state = saveState();
+                    consume();
+                    
+                    if (tryConsume('u')) {
+                        int surrogate2 = tryConsumeHex(4);
+                        if (U16_IS_TRAIL(surrogate2)) {
+                            u = U16_GET_SUPPLEMENTARY(u, surrogate2);
+                            delegate.atomPatternCharacter(u);
+                            break;
+                        }
+                    }
+
+                    restoreState(state);
+                }
</ins><span class="cx">                 delegate.atomPatternCharacter(u);
</span><ins>+            }
</ins><span class="cx">             break;
</span><span class="cx">         }
</span><span class="cx"> 
</span><span class="lines">@@ -427,6 +473,22 @@
</span><span class="cx">         return true;
</span><span class="cx">     }
</span><span class="cx"> 
</span><ins>+    UChar32 consumePossibleSurrogatePair()
+    {
+        UChar32 ch = consume();
+        if (U16_IS_LEAD(ch) &amp;&amp; m_isUnicode &amp;&amp; (patternRemaining() &gt; 0)) {
+            ParseState state = saveState();
+
+            UChar32 surrogate2 = consume();
+            if (U16_IS_TRAIL(surrogate2))
+                ch = U16_GET_SUPPLEMENTARY(ch, surrogate2);
+            else
+                restoreState(state);
+        }
+
+        return ch;
+    }
+
</ins><span class="cx">     /*
</span><span class="cx">      * parseAtomEscape(), parseCharacterClassEscape():
</span><span class="cx">      *
</span><span class="lines">@@ -470,7 +532,7 @@
</span><span class="cx">                 break;
</span><span class="cx"> 
</span><span class="cx">             default:
</span><del>-                characterClassConstructor.atomPatternCharacter(consume(), true);
</del><ins>+                characterClassConstructor.atomPatternCharacter(consumePossibleSurrogatePair(), true);
</ins><span class="cx">             }
</span><span class="cx"> 
</span><span class="cx">             if (m_err)
</span><span class="lines">@@ -662,7 +724,7 @@
</span><span class="cx">             FALLTHROUGH;
</span><span class="cx"> 
</span><span class="cx">             default:
</span><del>-                m_delegate.atomPatternCharacter(consume());
</del><ins>+                m_delegate.atomPatternCharacter(consumePossibleSurrogatePair());
</ins><span class="cx">                 lastTokenWasAnAtom = true;
</span><span class="cx">             }
</span><span class="cx"> 
</span><span class="lines">@@ -701,6 +763,7 @@
</span><span class="cx">             REGEXP_ERROR_PREFIX &quot;missing terminating ] for character class&quot;,
</span><span class="cx">             REGEXP_ERROR_PREFIX &quot;range out of order in character class&quot;,
</span><span class="cx">             REGEXP_ERROR_PREFIX &quot;\\ at end of pattern&quot;,
</span><ins>+            REGEXP_ERROR_PREFIX &quot;invalid unicode {} escape&quot;
</ins><span class="cx">         };
</span><span class="cx"> 
</span><span class="cx">         return errorMessages[m_err];
</span><span class="lines">@@ -726,6 +789,12 @@
</span><span class="cx">         return m_index == m_size;
</span><span class="cx">     }
</span><span class="cx"> 
</span><ins>+    unsigned patternRemaining()
+    {
+        ASSERT(m_index &lt;= m_size);
+        return m_size - m_index;
+    }
+
</ins><span class="cx">     int peek()
</span><span class="cx">     {
</span><span class="cx">         ASSERT(m_index &lt; m_size);
</span><span class="lines">@@ -805,6 +874,7 @@
</span><span class="cx">     const CharType* m_data;
</span><span class="cx">     unsigned m_size;
</span><span class="cx">     unsigned m_index;
</span><ins>+    bool m_isUnicode;
</ins><span class="cx">     unsigned m_parenthesesNestingDepth;
</span><span class="cx"> 
</span><span class="cx">     // Derived by empirical testing of compile time in PCRE and WREC.
</span><span class="lines">@@ -825,11 +895,11 @@
</span><span class="cx">  *    void assertionEOL();
</span><span class="cx">  *    void assertionWordBoundary(bool invert);
</span><span class="cx">  *
</span><del>- *    void atomPatternCharacter(UChar ch);
</del><ins>+ *    void atomPatternCharacter(UChar32 ch);
</ins><span class="cx">  *    void atomBuiltInCharacterClass(BuiltInCharacterClassID classID, bool invert);
</span><span class="cx">  *    void atomCharacterClassBegin(bool invert)
</span><del>- *    void atomCharacterClassAtom(UChar ch)
- *    void atomCharacterClassRange(UChar begin, UChar end)
</del><ins>+ *    void atomCharacterClassAtom(UChar32 ch)
+ *    void atomCharacterClassRange(UChar32 begin, UChar32 end)
</ins><span class="cx">  *    void atomCharacterClassBuiltIn(BuiltInCharacterClassID classID, bool invert)
</span><span class="cx">  *    void atomCharacterClassEnd()
</span><span class="cx">  *    void atomParenthesesSubpatternBegin(bool capture = true);
</span><span class="lines">@@ -871,11 +941,11 @@
</span><span class="cx">  */
</span><span class="cx"> 
</span><span class="cx"> template&lt;class Delegate&gt;
</span><del>-const char* parse(Delegate&amp; delegate, const String&amp; pattern, unsigned backReferenceLimit = quantifyInfinite)
</del><ins>+const char* parse(Delegate&amp; delegate, const String&amp; pattern, bool isUnicode, unsigned backReferenceLimit = quantifyInfinite)
</ins><span class="cx"> {
</span><span class="cx">     if (pattern.is8Bit())
</span><del>-        return Parser&lt;Delegate, LChar&gt;(delegate, pattern, backReferenceLimit).parse();
-    return Parser&lt;Delegate, UChar&gt;(delegate, pattern, backReferenceLimit).parse();
</del><ins>+        return Parser&lt;Delegate, LChar&gt;(delegate, pattern, isUnicode, backReferenceLimit).parse();
+    return Parser&lt;Delegate, UChar&gt;(delegate, pattern, isUnicode, backReferenceLimit).parse();
</ins><span class="cx"> }
</span><span class="cx"> 
</span><span class="cx"> } } // namespace JSC::Yarr
</span></span></pre></div>
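<p>The YarrParser.h diff above adds two ways to write a supplementary code point in a pattern: a braced <code>\u{...}</code> escape and, in unicode mode, a pair of <code>\uXXXX</code> escapes forming a lead and trail surrogate. The sketch below illustrates that decoding on a plain <code>std::string</code>. It is an assumption-laden illustration rather than the parser's code: <code>decodeUnicodeEscape</code>, <code>readHex4</code> and <code>hexValue</code> are hypothetical helpers, and unlike the parser (which falls back to a literal <code>u</code> in some cases) this sketch simply returns -1 for anything malformed.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

namespace {

// Returns the value of a hex digit, or -1 if c is not one.
int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Reads exactly four hex digits at index. On failure returns -1 and leaves index unchanged.
std::int32_t readHex4(const std::string& s, std::size_t& index)
{
    if (index + 4 > s.size())
        return -1;
    std::int32_t value = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        int digit = hexValue(s[index + i]);
        if (digit < 0)
            return -1;
        value = (value << 4) | digit;
    }
    index += 4;
    return value;
}

} // namespace

// Decodes the text that follows "\u"; index points just past the 'u'.
// Returns the code point and advances index, or returns -1 for a malformed escape.
std::int32_t decodeUnicodeEscape(const std::string& s, std::size_t& index, bool unicodeMode)
{
    // Braced form: \u{H...H}, limited to U+10FFFF.
    if (index < s.size() && s[index] == '{') {
        std::size_t i = index + 1;
        std::int32_t codePoint = 0;
        bool sawDigit = false;
        while (i < s.size() && s[i] != '}') {
            int digit = hexValue(s[i++]);
            if (digit < 0)
                return -1;
            codePoint = (codePoint << 4) | digit;
            if (codePoint > 0x10FFFF)
                return -1;
            sawDigit = true;
        }
        if (i >= s.size() || !sawDigit)
            return -1;
        index = i + 1; // Skip the closing brace.
        return codePoint;
    }

    // Four-digit form: \uXXXX.
    std::int32_t unit = readHex4(s, index);
    if (unit < 0)
        return -1;

    // In unicode mode, a lead surrogate followed by \uXXXX with a trail surrogate is combined.
    bool isLead = (unit & 0xFC00) == 0xD800;
    if (isLead && unicodeMode && index + 6 <= s.size() && s[index] == '\\' && s[index + 1] == 'u') {
        std::size_t saved = index;
        index += 2;
        std::int32_t trail = readHex4(s, index);
        if (trail >= 0 && (trail & 0xFC00) == 0xDC00)
            return 0x10000 + ((unit - 0xD800) << 10) + (trail - 0xDC00);
        index = saved; // Not a trail surrogate: back out and keep the lone lead.
    }
    return unit;
}
```

<p>As in the diff, the second escape is only consumed once it is known to be a trail surrogate. Otherwise the saved index is restored and the lead surrogate is reported on its own.</p>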
<a id="trunkSourceJavaScriptCoreyarrYarrPatterncpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/yarr/YarrPattern.cpp (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrPattern.cpp        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/yarr/YarrPattern.cpp        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -1,5 +1,5 @@
</span><span class="cx"> /*
</span><del>- * Copyright (C) 2009, 2013 Apple Inc. All rights reserved.
</del><ins>+ * Copyright (C) 2009, 2013-2016 Apple Inc. All rights reserved.
</ins><span class="cx">  * Copyright (C) 2010 Peter Varga (pvarga@inf.u-szeged.hu), University of Szeged
</span><span class="cx">  *
</span><span class="cx">  * Redistribution and use in source and binary forms, with or without
</span><span class="lines">@@ -28,7 +28,7 @@
</span><span class="cx"> #include &quot;YarrPattern.h&quot;
</span><span class="cx"> 
</span><span class="cx"> #include &quot;Yarr.h&quot;
</span><del>-#include &quot;YarrCanonicalizeUCS2.h&quot;
</del><ins>+#include &quot;YarrCanonicalizeUnicode.h&quot;
</ins><span class="cx"> #include &quot;YarrParser.h&quot;
</span><span class="cx"> #include &lt;wtf/Vector.h&gt;
</span><span class="cx"> 
</span><span class="lines">@@ -40,8 +40,9 @@
</span><span class="cx"> 
</span><span class="cx"> class CharacterClassConstructor {
</span><span class="cx"> public:
</span><del>-    CharacterClassConstructor(bool isCaseInsensitive = false)
</del><ins>+    CharacterClassConstructor(bool isCaseInsensitive, CanonicalMode canonicalMode)
</ins><span class="cx">         : m_isCaseInsensitive(isCaseInsensitive)
</span><ins>+        , m_canonicalMode(canonicalMode)
</ins><span class="cx">     {
</span><span class="cx">     }
</span><span class="cx">     
</span><span class="lines">@@ -65,7 +66,7 @@
</span><span class="cx">             addSortedRange(m_rangesUnicode, other-&gt;m_rangesUnicode[i].begin, other-&gt;m_rangesUnicode[i].end);
</span><span class="cx">     }
</span><span class="cx"> 
</span><del>-    void putChar(UChar ch)
</del><ins>+    void putChar(UChar32 ch)
</ins><span class="cx">     {
</span><span class="cx">         // Handle ascii cases.
</span><span class="cx">         if (ch &lt;= 0x7f) {
</span><span class="lines">@@ -84,33 +85,32 @@
</span><span class="cx">         }
</span><span class="cx"> 
</span><span class="cx">         // Add multiple matches, if necessary.
</span><del>-        const UCS2CanonicalizationRange* info = rangeInfoFor(ch);
</del><ins>+        const CanonicalizationRange* info = canonicalRangeInfoFor(ch, m_canonicalMode);
</ins><span class="cx">         if (info-&gt;type == CanonicalizeUnique)
</span><span class="cx">             addSorted(m_matchesUnicode, ch);
</span><span class="cx">         else
</span><span class="cx">             putUnicodeIgnoreCase(ch, info);
</span><span class="cx">     }
</span><span class="cx"> 
</span><del>-    void putUnicodeIgnoreCase(UChar ch, const UCS2CanonicalizationRange* info)
</del><ins>+    void putUnicodeIgnoreCase(UChar32 ch, const CanonicalizationRange* info)
</ins><span class="cx">     {
</span><span class="cx">         ASSERT(m_isCaseInsensitive);
</span><del>-        ASSERT(ch &gt; 0x7f);
</del><span class="cx">         ASSERT(ch &gt;= info-&gt;begin &amp;&amp; ch &lt;= info-&gt;end);
</span><span class="cx">         ASSERT(info-&gt;type != CanonicalizeUnique);
</span><span class="cx">         if (info-&gt;type == CanonicalizeSet) {
</span><del>-            for (const uint16_t* set = characterSetInfo[info-&gt;value]; (ch = *set); ++set)
-                addSorted(m_matchesUnicode, ch);
</del><ins>+            for (const UChar32* set = canonicalCharacterSetInfo(info-&gt;value, m_canonicalMode); (ch = *set); ++set)
+                addSorted(ch);
</ins><span class="cx">         } else {
</span><del>-            addSorted(m_matchesUnicode, ch);
-            addSorted(m_matchesUnicode, getCanonicalPair(info, ch));
</del><ins>+            addSorted(ch);
+            addSorted(getCanonicalPair(info, ch));
</ins><span class="cx">         }
</span><span class="cx">     }
</span><span class="cx"> 
</span><del>-    void putRange(UChar lo, UChar hi)
</del><ins>+    void putRange(UChar32 lo, UChar32 hi)
</ins><span class="cx">     {
</span><span class="cx">         if (lo &lt;= 0x7f) {
</span><span class="cx">             char asciiLo = lo;
</span><del>-            char asciiHi = std::min(hi, (UChar)0x7f);
</del><ins>+            char asciiHi = std::min(hi, (UChar32)0x7f);
</ins><span class="cx">             addSortedRange(m_ranges, lo, asciiHi);
</span><span class="cx">             
</span><span class="cx">             if (m_isCaseInsensitive) {
</span><span class="lines">@@ -123,16 +123,16 @@
</span><span class="cx">         if (hi &lt;= 0x7f)
</span><span class="cx">             return;
</span><span class="cx"> 
</span><del>-        lo = std::max(lo, (UChar)0x80);
</del><ins>+        lo = std::max(lo, (UChar32)0x80);
</ins><span class="cx">         addSortedRange(m_rangesUnicode, lo, hi);
</span><span class="cx">         
</span><span class="cx">         if (!m_isCaseInsensitive)
</span><span class="cx">             return;
</span><span class="cx"> 
</span><del>-        const UCS2CanonicalizationRange* info = rangeInfoFor(lo);
</del><ins>+        const CanonicalizationRange* info = canonicalRangeInfoFor(lo, m_canonicalMode);
</ins><span class="cx">         while (true) {
</span><span class="cx">             // Handle the range [lo .. end]
</span><del>-            UChar end = std::min&lt;UChar&gt;(info-&gt;end, hi);
</del><ins>+            UChar32 end = std::min&lt;UChar32&gt;(info-&gt;end, hi);
</ins><span class="cx"> 
</span><span class="cx">             switch (info-&gt;type) {
</span><span class="cx">             case CanonicalizeUnique:
</span><span class="lines">@@ -140,7 +140,7 @@
</span><span class="cx">                 break;
</span><span class="cx">             case CanonicalizeSet: {
</span><span class="cx">                 UChar ch;
</span><del>-                for (const uint16_t* set = characterSetInfo[info-&gt;value]; (ch = *set); ++set)
</del><ins>+                for (const UChar32* set = canonicalCharacterSetInfo(info-&gt;value, m_canonicalMode); (ch = *set); ++set)
</ins><span class="cx">                     addSorted(m_matchesUnicode, ch);
</span><span class="cx">                 break;
</span><span class="cx">             }
</span><span class="lines">@@ -188,8 +188,13 @@
</span><span class="cx">     }
</span><span class="cx"> 
</span><span class="cx"> private:
</span><del>-    void addSorted(Vector&lt;UChar&gt;&amp; matches, UChar ch)
</del><ins>+    void addSorted(UChar32 ch)
</ins><span class="cx">     {
</span><ins>+        addSorted(ch &lt;= 0x7f ? m_matches : m_matchesUnicode, ch);
+    }
+
+    void addSorted(Vector&lt;UChar32&gt;&amp; matches, UChar32 ch)
+    {
</ins><span class="cx">         unsigned pos = 0;
</span><span class="cx">         unsigned range = matches.size();
</span><span class="cx"> 
</span><span class="lines">@@ -214,7 +219,7 @@
</span><span class="cx">             matches.insert(pos, ch);
</span><span class="cx">     }
</span><span class="cx"> 
</span><del>-    void addSortedRange(Vector&lt;CharacterRange&gt;&amp; ranges, UChar lo, UChar hi)
</del><ins>+    void addSortedRange(Vector&lt;CharacterRange&gt;&amp; ranges, UChar32 lo, UChar32 hi)
</ins><span class="cx">     {
</span><span class="cx">         unsigned end = ranges.size();
</span><span class="cx">         
</span><span class="lines">@@ -260,10 +265,11 @@
</span><span class="cx">     }
</span><span class="cx"> 
</span><span class="cx">     bool m_isCaseInsensitive;
</span><ins>+    CanonicalMode m_canonicalMode;
</ins><span class="cx"> 
</span><del>-    Vector&lt;UChar&gt; m_matches;
</del><ins>+    Vector&lt;UChar32&gt; m_matches;
</ins><span class="cx">     Vector&lt;CharacterRange&gt; m_ranges;
</span><del>-    Vector&lt;UChar&gt; m_matchesUnicode;
</del><ins>+    Vector&lt;UChar32&gt; m_matchesUnicode;
</ins><span class="cx">     Vector&lt;CharacterRange&gt; m_rangesUnicode;
</span><span class="cx"> };
</span><span class="cx"> 
</span><span class="lines">@@ -271,7 +277,7 @@
</span><span class="cx"> public:
</span><span class="cx">     YarrPatternConstructor(YarrPattern&amp; pattern)
</span><span class="cx">         : m_pattern(pattern)
</span><del>-        , m_characterClassConstructor(pattern.m_ignoreCase)
</del><ins>+        , m_characterClassConstructor(pattern.m_ignoreCase, pattern.m_unicode ? CanonicalMode::Unicode : CanonicalMode::UCS2)
</ins><span class="cx">         , m_invertParentheticalAssertion(false)
</span><span class="cx">     {
</span><span class="cx">         auto body = std::make_unique&lt;PatternDisjunction&gt;();
</span><span class="lines">@@ -313,16 +319,16 @@
</span><span class="cx">         m_alternative-&gt;m_terms.append(PatternTerm::WordBoundary(invert));
</span><span class="cx">     }
</span><span class="cx"> 
</span><del>-    void atomPatternCharacter(UChar ch)
</del><ins>+    void atomPatternCharacter(UChar32 ch)
</ins><span class="cx">     {
</span><span class="cx">         // We handle case-insensitive checking of unicode characters which do have both
</span><span class="cx">         // cases by handling them as if they were defined using a CharacterClass.
</span><del>-        if (!m_pattern.m_ignoreCase || isASCII(ch)) {
</del><ins>+        if (!m_pattern.m_ignoreCase || (isASCII(ch) &amp;&amp; !m_pattern.m_unicode)) {
</ins><span class="cx">             m_alternative-&gt;m_terms.append(PatternTerm(ch));
</span><span class="cx">             return;
</span><span class="cx">         }
</span><span class="cx"> 
</span><del>-        const UCS2CanonicalizationRange* info = rangeInfoFor(ch);
</del><ins>+        const CanonicalizationRange* info = canonicalRangeInfoFor(ch, m_pattern.m_unicode ? CanonicalMode::Unicode : CanonicalMode::UCS2);
</ins><span class="cx">         if (info-&gt;type == CanonicalizeUnique) {
</span><span class="cx">             m_alternative-&gt;m_terms.append(PatternTerm(ch));
</span><span class="cx">             return;
</span><span class="lines">@@ -357,12 +363,12 @@
</span><span class="cx">         m_invertCharacterClass = invert;
</span><span class="cx">     }
</span><span class="cx"> 
</span><del>-    void atomCharacterClassAtom(UChar ch)
</del><ins>+    void atomCharacterClassAtom(UChar32 ch)
</ins><span class="cx">     {
</span><span class="cx">         m_characterClassConstructor.putChar(ch);
</span><span class="cx">     }
</span><span class="cx"> 
</span><del>-    void atomCharacterClassRange(UChar begin, UChar end)
</del><ins>+    void atomCharacterClassRange(UChar32 begin, UChar32 end)
</ins><span class="cx">     {
</span><span class="cx">         m_characterClassConstructor.putRange(begin, end);
</span><span class="cx">     }
</span><span class="lines">@@ -596,6 +602,8 @@
</span><span class="cx">                     term.frameLocation = currentCallFrameSize;
</span><span class="cx">                     currentCallFrameSize += YarrStackSpaceForBackTrackInfoPatternCharacter;
</span><span class="cx">                     alternative-&gt;m_hasFixedSize = false;
</span><ins>+                } else if (m_pattern.m_unicode) {
+                    currentInputPosition += (!U_IS_BMP(term.patternCharacter) ? 2 : 1) * term.quantityCount;
</ins><span class="cx">                 } else
</span><span class="cx">                     currentInputPosition += term.quantityCount;
</span><span class="cx">                 break;
</span><span class="lines">@@ -606,6 +614,11 @@
</span><span class="cx">                     term.frameLocation = currentCallFrameSize;
</span><span class="cx">                     currentCallFrameSize += YarrStackSpaceForBackTrackInfoCharacterClass;
</span><span class="cx">                     alternative-&gt;m_hasFixedSize = false;
</span><ins>+                } else if (m_pattern.m_unicode) {
+                    term.frameLocation = currentCallFrameSize;
+                    currentCallFrameSize += YarrStackSpaceForBackTrackInfoCharacterClass;
+                    currentInputPosition += term.quantityCount;
+                    alternative-&gt;m_hasFixedSize = false;
</ins><span class="cx">                 } else
</span><span class="cx">                     currentInputPosition += term.quantityCount;
</span><span class="cx">                 break;
</span><span class="lines">@@ -832,7 +845,7 @@
</span><span class="cx"> {
</span><span class="cx">     YarrPatternConstructor constructor(*this);
</span><span class="cx"> 
</span><del>-    if (const char* error = parse(constructor, patternString))
</del><ins>+    if (const char* error = parse(constructor, patternString, m_unicode))
</ins><span class="cx">         return error;
</span><span class="cx">     
</span><span class="cx">     // If the pattern contains illegal backreferences reset &amp; reparse.
</span><span class="lines">@@ -846,7 +859,7 @@
</span><span class="cx"> #if !ASSERT_DISABLED
</span><span class="cx">         const char* error =
</span><span class="cx"> #endif
</span><del>-            parse(constructor, patternString, numSubpatterns);
</del><ins>+            parse(constructor, patternString, m_unicode, numSubpatterns);
</ins><span class="cx"> 
</span><span class="cx">         ASSERT(!error);
</span><span class="cx">         ASSERT(numSubpatterns == m_numSubpatterns);
</span><span class="lines">@@ -861,9 +874,10 @@
</span><span class="cx">     return 0;
</span><span class="cx"> }
</span><span class="cx"> 
</span><del>-YarrPattern::YarrPattern(const String&amp; pattern, bool ignoreCase, bool multiline, const char** error)
</del><ins>+YarrPattern::YarrPattern(const String&amp; pattern, bool ignoreCase, bool multiline, bool unicode, const char** error)
</ins><span class="cx">     : m_ignoreCase(ignoreCase)
</span><span class="cx">     , m_multiline(multiline)
</span><ins>+    , m_unicode(unicode)
</ins><span class="cx">     , m_containsBackreferences(false)
</span><span class="cx">     , m_containsBOL(false)
</span><span class="cx">     , m_containsUnsignedLengthPattern(false)
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreyarrYarrPatternh"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/yarr/YarrPattern.h (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrPattern.h        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/yarr/YarrPattern.h        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -1,5 +1,5 @@
</span><span class="cx"> /*
</span><del>- * Copyright (C) 2009, 2013 Apple Inc. All rights reserved.
</del><ins>+ * Copyright (C) 2009, 2013-2014, 2016 Apple Inc. All rights reserved.
</ins><span class="cx">  * Copyright (C) 2010 Peter Varga (pvarga@inf.u-szeged.hu), University of Szeged
</span><span class="cx">  *
</span><span class="cx">  * Redistribution and use in source and binary forms, with or without
</span><span class="lines">@@ -37,10 +37,10 @@
</span><span class="cx"> struct PatternDisjunction;
</span><span class="cx"> 
</span><span class="cx"> struct CharacterRange {
</span><del>-    UChar begin;
-    UChar end;
</del><ins>+    UChar32 begin;
+    UChar32 end;
</ins><span class="cx"> 
</span><del>-    CharacterRange(UChar begin, UChar end)
</del><ins>+    CharacterRange(UChar32 begin, UChar32 end)
</ins><span class="cx">         : begin(begin)
</span><span class="cx">         , end(end)
</span><span class="cx">     {
</span><span class="lines">@@ -62,9 +62,9 @@
</span><span class="cx">         , m_tableInverted(inverted)
</span><span class="cx">     {
</span><span class="cx">     }
</span><del>-    Vector&lt;UChar&gt; m_matches;
</del><ins>+    Vector&lt;UChar32&gt; m_matches;
</ins><span class="cx">     Vector&lt;CharacterRange&gt; m_ranges;
</span><del>-    Vector&lt;UChar&gt; m_matchesUnicode;
</del><ins>+    Vector&lt;UChar32&gt; m_matchesUnicode;
</ins><span class="cx">     Vector&lt;CharacterRange&gt; m_rangesUnicode;
</span><span class="cx"> 
</span><span class="cx">     const char* m_table;
</span><span class="lines">@@ -93,7 +93,7 @@
</span><span class="cx">     bool m_capture :1;
</span><span class="cx">     bool m_invert :1;
</span><span class="cx">     union {
</span><del>-        UChar patternCharacter;
</del><ins>+        UChar32 patternCharacter;
</ins><span class="cx">         CharacterClass* characterClass;
</span><span class="cx">         unsigned backReferenceSubpatternId;
</span><span class="cx">         struct {
</span><span class="lines">@@ -113,7 +113,7 @@
</span><span class="cx">     int inputPosition;
</span><span class="cx">     unsigned frameLocation;
</span><span class="cx"> 
</span><del>-    PatternTerm(UChar ch)
</del><ins>+    PatternTerm(UChar32 ch)
</ins><span class="cx">         : type(PatternTerm::TypePatternCharacter)
</span><span class="cx">         , m_capture(false)
</span><span class="cx">         , m_invert(false)
</span><span class="lines">@@ -300,7 +300,7 @@
</span><span class="cx"> };
</span><span class="cx"> 
</span><span class="cx"> struct YarrPattern {
</span><del>-    JS_EXPORT_PRIVATE YarrPattern(const String&amp; pattern, bool ignoreCase, bool multiline, const char** error);
</del><ins>+    JS_EXPORT_PRIVATE YarrPattern(const String&amp; pattern, bool ignoreCase, bool multiline, bool unicode, const char** error);
</ins><span class="cx"> 
</span><span class="cx">     void reset()
</span><span class="cx">     {
</span><span class="lines">@@ -392,6 +392,7 @@
</span><span class="cx"> 
</span><span class="cx">     bool m_ignoreCase : 1;
</span><span class="cx">     bool m_multiline : 1;
</span><ins>+    bool m_unicode : 1;
</ins><span class="cx">     bool m_containsBackreferences : 1;
</span><span class="cx">     bool m_containsBOL : 1;
</span><span class="cx">     bool m_containsUnsignedLengthPattern : 1; 
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreyarrYarrSyntaxCheckercpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/yarr/YarrSyntaxChecker.cpp (197425 => 197426)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrSyntaxChecker.cpp        2016-03-02 00:31:47 UTC (rev 197425)
+++ trunk/Source/JavaScriptCore/yarr/YarrSyntaxChecker.cpp        2016-03-02 00:39:01 UTC (rev 197426)
</span><span class="lines">@@ -1,5 +1,5 @@
</span><span class="cx"> /*
</span><del>- * Copyright (C) 2011 Apple Inc. All rights reserved.
</del><ins>+ * Copyright (C) 2011, 2016 Apple Inc. All rights reserved.
</ins><span class="cx">  *
</span><span class="cx">  * Redistribution and use in source and binary forms, with or without
</span><span class="cx">  * modification, are permitted provided that the following conditions
</span><span class="lines">@@ -35,7 +35,7 @@
</span><span class="cx">     void assertionBOL() {}
</span><span class="cx">     void assertionEOL() {}
</span><span class="cx">     void assertionWordBoundary(bool) {}
</span><del>-    void atomPatternCharacter(UChar) {}
</del><ins>+    void atomPatternCharacter(UChar32) {}
</ins><span class="cx">     void atomBuiltInCharacterClass(BuiltInCharacterClassID, bool) {}
</span><span class="cx">     void atomCharacterClassBegin(bool = false) {}
</span><span class="cx">     void atomCharacterClassAtom(UChar) {}
</span><span class="lines">@@ -53,7 +53,7 @@
</span><span class="cx"> const char* checkSyntax(const String&amp; pattern)
</span><span class="cx"> {
</span><span class="cx">     SyntaxChecker syntaxChecker;
</span><del>-    return parse(syntaxChecker, pattern);
</del><ins>+    return parse(syntaxChecker, pattern, false);
</ins><span class="cx"> }
</span><span class="cx"> 
</span><span class="cx"> }} // JSC::YARR
</span></span></pre>
</div>
</div>

</body>
</html>