<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"

"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">

<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />

<title>[183552] trunk</title>

</head>

<body>

<style type="text/css"><!--

#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }

#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }

#msg dt:after { content:':';}

#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt;  }

#msg dl a { font-weight: bold}

#msg dl a:link    { color:#fc3; }

#msg dl a:active  { color:#ff0; }

#msg dl a:visited { color:#cc6; }

h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }

#msg pre { overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }

#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }

#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }

#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }

#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }

#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }

#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }

#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }

#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }

#logmsg pre { background: #eee; padding: 1em; }

#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}

#logmsg dl { margin: 0; }

#logmsg dt { font-weight: bold; }

#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }

#logmsg dd:before { content:'\00bb';}

#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }

#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }

#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }

#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }

#logmsg table th.Corner { text-align: left; }

#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }

#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }

#patch { width: 100%; }

#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}

#patch .propset h4, #patch .binary h4 {margin:0;}

#patch pre {padding:0;line-height:1.2em;margin:0;}

#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}

#patch .propset .diff, #patch .binary .diff  {padding:10px 0;}

#patch span {display:block;padding:0 10px;}

#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}

#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}

#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}

#patch .lines, .info {color:#888;background:#fff;}

--></style>

<div id="msg">

<dl class="meta">

<dt>Revision</dt> <dd><a href="http://trac.webkit.org/projects/webkit/changeset/183552">183552</a></dd>

<dt>Author</dt> <dd>darin@apple.com</dd>

<dt>Date</dt> <dd>2015-04-29 09:33:12 -0700 (Wed, 29 Apr 2015)</dd>

</dl>

<h3>Log Message</h3>

<pre>[ES6] Implement Unicode code point escapes

https://bugs.webkit.org/show_bug.cgi?id=144377

Reviewed by Antti Koivisto.

Source/JavaScriptCore:

* parser/Lexer.cpp: Moved the UnicodeHexValue class in here from

the header. Made it a non-member class so it doesn't need to be part

of a template. Made it use UChar32 instead of int for the value to

make it clearer what goes into this class.

(JSC::ParsedUnicodeEscapeValue::isIncomplete): Added. Replaces the

old type() function.

(JSC::Lexer&lt;CharacterType&gt;::parseUnicodeEscape): Renamed from

parseFourDigitUnicodeHex and added support for code point escapes.

(JSC::isLatin1): Added an overload for UChar32.

(JSC::isIdentStart): Changed this to take UChar32; no caller tries

to call it with a UChar, so no need to overload for that type for now.

(JSC::isNonLatin1IdentPart): Changed argument type to UChar32 for clarity.

Also added FIXME about a subtle ES6 change that we might want to make later.

(JSC::isIdentPart): Changed this to take UChar32; no caller tries

to call it with a UChar, so no need to overload for that type for now.

(JSC::isIdentPartIncludingEscapeTemplate): Made this a template so that we

don't need to repeat the code twice. Added code to handle code point escapes.

(JSC::isIdentPartIncludingEscape): Call the template instead of having the

code in line.

(JSC::Lexer&lt;CharacterType&gt;::recordUnicodeCodePoint): Added.

(JSC::Lexer&lt;CharacterType&gt;::parseIdentifierSlowCase): Made small tweaks and

updated to call parseUnicodeEscape instead of parseFourDigitUnicodeHex.

(JSC::Lexer&lt;CharacterType&gt;::parseComplexEscape): Call parseUnicodeEscape

instead of parseFourDigitUnicodeHex. Move the code to handle &quot;\u&quot; before

the code that handles the escapes, since the code point escape code now

consumes characters while parsing rather than peeking ahead. Test case

covers this: Symptom would be that &quot;\u{&quot; would evaluate to &quot;u&quot; instead of

giving a syntax error.

* parser/Lexer.h: Updated for above changes.

* runtime/StringConstructor.cpp:

(JSC::stringFromCodePoint): Use ICU's UCHAR_MAX_VALUE instead of writing

out 0x10FFFF; clearer this way.

Source/WebCore:

Test: js/unicode-escape-sequences.html

* css/CSSParser.cpp:

(WebCore::CSSParser::parseEscape): Use ICU's UCHAR_MAX_VALUE instead of writing

out 0x10FFFF; clearer this way. Also use our replacementCharacter instead of

writing out 0xFFFD.

* html/parser/HTMLEntityParser.cpp:

(WebCore::isAlphaNumeric): Deleted.

(WebCore::HTMLEntityParser::legalEntityFor): Use ICU's UCHAR_MAX_VALUE and

U_IS_SURROGATE instead of writing the code out. Didn't use U_IS_UNICODE_CHAR

because that also includes U_IS_UNICODE_NONCHAR and thus would change behavior,

but maybe it's something we want to do in the future.

(WebCore::HTMLEntityParser::consumeNamedEntity): Use isASCIIAlphanumeric instead

of the function in this file that does the same thing less efficiently.

* html/parser/InputStreamPreprocessor.h:

(WebCore::InputStreamPreprocessor::processNextInputCharacter): Use

replacementCharacter from CharacterNames.h instead of writing out 0xFFFD.

* xml/parser/CharacterReferenceParserInlines.h:

(WebCore::consumeCharacterReference): Use ICU's UCHAR_MAX_VALUE instead of

defining our own local highestValidCharacter constant.

LayoutTests:

* js/script-tests/unicode-escape-sequences.js: Added.

* js/unicode-escape-sequences-expected.txt: Added.

* js/unicode-escape-sequences.html: Added. Generated with make-script-test-wrappers.</pre>
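<p>The string-literal tests in this change expect a code point escape above U+FFFF to produce two UTF-16 code units. The following is a minimal sketch of that arithmetic, not the Lexer.cpp code from this patch; the helper names toUTF16CodeUnits and formatCodeUnits are hypothetical and exist only for this illustration.</p>

<pre>
```javascript
// Hypothetical helper (not part of the patch): split a Unicode code point
// into the UTF-16 code units that a string would store for it.
function toUTF16CodeUnits(codePoint)
{
    if (!Number.isInteger(codePoint) || 0 > codePoint || codePoint > 0x10FFFF)
        throw new RangeError("Not a valid Unicode code point");

    // Code points up to U+FFFF (including lone surrogates) fit in one code unit.
    if (0xFFFF >= codePoint)
        return [codePoint];

    // Supplementary code points become a lead/trail surrogate pair.
    var offset = codePoint - 0x10000;
    var lead = 0xD800 + Math.floor(offset / 0x400);
    var trail = 0xDC00 + (offset % 0x400);
    return [lead, trail];
}

// Hypothetical helper: format code units as comma-separated 4-digit uppercase hex,
// the same shape the test's codeUnits() function prints.
function formatCodeUnits(units)
{
    return units.map(function (unit) {
        return ("000" + unit.toString(16).toUpperCase()).slice(-4);
    }).join(",");
}
```
</pre>

<p>Under these assumptions, formatCodeUnits(toUTF16CodeUnits(0x1D306)) gives "D834,DF06" and formatCodeUnits(toUTF16CodeUnits(0x10FFFF)) gives "DBFF,DFFF", which are the values the new test expects for \u{1D306} and \u{10FFFF}.</p>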

<h3>Modified Paths</h3>

<ul>

<li><a href="#trunkLayoutTestsChangeLog">trunk/LayoutTests/ChangeLog</a></li>

<li><a href="#trunkSourceJavaScriptCoreChangeLog">trunk/Source/JavaScriptCore/ChangeLog</a></li>

<li><a href="#trunkSourceJavaScriptCoreparserLexercpp">trunk/Source/JavaScriptCore/parser/Lexer.cpp</a></li>

<li><a href="#trunkSourceJavaScriptCoreparserLexerh">trunk/Source/JavaScriptCore/parser/Lexer.h</a></li>

<li><a href="#trunkSourceJavaScriptCoreruntimeStringConstructorcpp">trunk/Source/JavaScriptCore/runtime/StringConstructor.cpp</a></li>

<li><a href="#trunkSourceWebCoreChangeLog">trunk/Source/WebCore/ChangeLog</a></li>

<li><a href="#trunkSourceWebCorecssCSSParsercpp">trunk/Source/WebCore/css/CSSParser.cpp</a></li>

<li><a href="#trunkSourceWebCorehtmlparserHTMLEntityParsercpp">trunk/Source/WebCore/html/parser/HTMLEntityParser.cpp</a></li>

<li><a href="#trunkSourceWebCorehtmlparserInputStreamPreprocessorh">trunk/Source/WebCore/html/parser/InputStreamPreprocessor.h</a></li>

<li><a href="#trunkSourceWebCorexmlparserCharacterReferenceParserInlinesh">trunk/Source/WebCore/xml/parser/CharacterReferenceParserInlines.h</a></li>

</ul>

<h3>Added Paths</h3>

<ul>

<li><a href="#trunkLayoutTestsjsscripttestsunicodeescapesequencesjs">trunk/LayoutTests/js/script-tests/unicode-escape-sequences.js</a></li>

<li><a href="#trunkLayoutTestsjsunicodeescapesequencesexpectedtxt">trunk/LayoutTests/js/unicode-escape-sequences-expected.txt</a></li>

<li><a href="#trunkLayoutTestsjsunicodeescapesequenceshtml">trunk/LayoutTests/js/unicode-escape-sequences.html</a></li>

</ul>

</div>

<div id="patch">

<h3>Diff</h3>

<a id="trunkLayoutTestsChangeLog"></a>

<div class="modfile"><h4>Modified: trunk/LayoutTests/ChangeLog (183551 => 183552)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/LayoutTests/ChangeLog        2015-04-29 16:32:05 UTC (rev 183551)

+++ trunk/LayoutTests/ChangeLog        2015-04-29 16:33:12 UTC (rev 183552)

</span><span class="lines">@@ -1,3 +1,14 @@

</span><ins>+2015-04-29  Darin Adler  &lt;darin@apple.com&gt;

+

+        [ES6] Implement Unicode code point escapes

+        https://bugs.webkit.org/show_bug.cgi?id=144377

+

+        Reviewed by Antti Koivisto.

+

+        * js/script-tests/unicode-escape-sequences.js: Added.

+        * js/unicode-escape-sequences-expected.txt: Added.

+        * js/unicode-escape-sequences.html: Added. Generated with make-script-test-wrappers.

+

</ins><span class="cx"> 2015-04-29  Hyungwook Lee  &lt;hyungwook.lee@navercorp.com&gt;

</span><span class="cx"> 

</span><span class="cx">         Fix crash in WebCore::LogicalSelectionOffsetCaches::ContainingBlockInfo::setBlock().

</span></span></pre></div>

<a id="trunkLayoutTestsjsscripttestsunicodeescapesequencesjs"></a>

<div class="addfile"><h4>Added: trunk/LayoutTests/js/script-tests/unicode-escape-sequences.js (0 => 183552)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/LayoutTests/js/script-tests/unicode-escape-sequences.js                                (rev 0)

+++ trunk/LayoutTests/js/script-tests/unicode-escape-sequences.js        2015-04-29 16:33:12 UTC (rev 183552)

</span><span class="lines">@@ -0,0 +1,138 @@

</span><ins>+description(&quot;Test of Unicode escape sequences in string literals and identifiers, especially code point escape sequences.&quot;);

+

+function codeUnits(string)

+{

+    var result = [];

+    for (var i = 0; i &lt; string.length; ++i) {

+        var hex = &quot;000&quot; + string.charCodeAt(i).toString(16).toUpperCase();

+        result.push(hex.substring(hex.length - 4));

+    }

+    return result.join(&quot;,&quot;);

+}

+

+function testStringUnicodeEscapeSequence(sequence, expectedResult)

+{

+    shouldBeEqualToString('codeUnits(&quot;\\u' + sequence + '&quot;)', expectedResult);

+}

+

+function testInvalidStringUnicodeEscapeSequence(sequence)

+{

+    shouldThrow('codeUnits(&quot;\\u' + sequence + '&quot;)');

+}

+

+function testIdentifierStartUnicodeEscapeSequence(sequence, expectedResult)

+{

+    shouldBeEqualToString('codeUnits(function \\u' + sequence + '(){}.name)', expectedResult);

+}

+

+function testInvalidIdentifierStartUnicodeEscapeSequence(sequence)

+{

+    shouldThrow('codeUnits(function \\u' + sequence + '(){}.name)');

+}

+

+function testIdentifierPartUnicodeEscapeSequence(sequence, expectedResult)

+{

+    shouldBeEqualToString('codeUnits(function x\\u' + sequence + '(){}.name.substring(1))', expectedResult);

+}

+

+function testInvalidIdentifierPartUnicodeEscapeSequence(sequence)

+{

+    shouldThrow('codeUnits(function x\\u' + sequence + '(){}.name.substring(1))');

+}

+

+testStringUnicodeEscapeSequence(&quot;&quot;, &quot;0075&quot;);

+testStringUnicodeEscapeSequence(&quot;{0}&quot;, &quot;0000&quot;);

+testStringUnicodeEscapeSequence(&quot;{41}&quot;, &quot;0041&quot;);

+testStringUnicodeEscapeSequence(&quot;{D800}&quot;, &quot;D800&quot;);

+testStringUnicodeEscapeSequence(&quot;{d800}&quot;, &quot;D800&quot;);

+testStringUnicodeEscapeSequence(&quot;{DC00}&quot;, &quot;DC00&quot;);

+testStringUnicodeEscapeSequence(&quot;{dc00}&quot;, &quot;DC00&quot;);

+testStringUnicodeEscapeSequence(&quot;{FFFF}&quot;, &quot;FFFF&quot;);

+testStringUnicodeEscapeSequence(&quot;{ffff}&quot;, &quot;FFFF&quot;);

+testStringUnicodeEscapeSequence(&quot;{10000}&quot;, &quot;D800,DC00&quot;);

+testStringUnicodeEscapeSequence(&quot;{10001}&quot;, &quot;D800,DC01&quot;);

+testStringUnicodeEscapeSequence(&quot;{102C0}&quot;, &quot;D800,DEC0&quot;);

+testStringUnicodeEscapeSequence(&quot;{102c0}&quot;, &quot;D800,DEC0&quot;);

+testStringUnicodeEscapeSequence(&quot;{1D306}&quot;, &quot;D834,DF06&quot;);

+testStringUnicodeEscapeSequence(&quot;{1d306}&quot;, &quot;D834,DF06&quot;);

+testStringUnicodeEscapeSequence(&quot;{10FFFE}&quot;, &quot;DBFF,DFFE&quot;);

+testStringUnicodeEscapeSequence(&quot;{10fffe}&quot;, &quot;DBFF,DFFE&quot;);

+testStringUnicodeEscapeSequence(&quot;{10FFFF}&quot;, &quot;DBFF,DFFF&quot;);

+testStringUnicodeEscapeSequence(&quot;{10ffff}&quot;, &quot;DBFF,DFFF&quot;);

+testStringUnicodeEscapeSequence(&quot;{00000000000000000000000010FFFF}&quot;, &quot;DBFF,DFFF&quot;);

+testStringUnicodeEscapeSequence(&quot;{00000000000000000000000010ffff}&quot;, &quot;DBFF,DFFF&quot;);

+

+testInvalidStringUnicodeEscapeSequence(&quot;x&quot;);

+testInvalidStringUnicodeEscapeSequence(&quot;{&quot;);

+testInvalidStringUnicodeEscapeSequence(&quot;{}&quot;);

+testInvalidStringUnicodeEscapeSequence(&quot;{G}&quot;);

+testInvalidStringUnicodeEscapeSequence(&quot;{1G}&quot;);

+testInvalidStringUnicodeEscapeSequence(&quot;{110000}&quot;);

+testInvalidStringUnicodeEscapeSequence(&quot;{1000000}&quot;);

+testInvalidStringUnicodeEscapeSequence(&quot;{100000000000000000000000}&quot;);

+

+testIdentifierStartUnicodeEscapeSequence(&quot;{41}&quot;, &quot;0041&quot;);

+testIdentifierStartUnicodeEscapeSequence(&quot;{102C0}&quot;, &quot;D800,DEC0&quot;);

+testIdentifierStartUnicodeEscapeSequence(&quot;{102c0}&quot;, &quot;D800,DEC0&quot;);

+testIdentifierStartUnicodeEscapeSequence(&quot;{1D306}&quot;, &quot;D834,DF06&quot;);

+testIdentifierStartUnicodeEscapeSequence(&quot;{1d306}&quot;, &quot;D834,DF06&quot;);

+

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{0}&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{D800}&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{d800}&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{DC00}&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{dc00}&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{FFFF}&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{ffff}&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{10000}&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{10001}&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{10FFFE}&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{10fffe}&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{10FFFF}&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{10ffff}&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{00000000000000000000000010FFFF}&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{00000000000000000000000010ffff}&quot;);

+

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;x&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{}&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{G}&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{1G}&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{110000}&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{1000000}&quot;);

+testInvalidIdentifierStartUnicodeEscapeSequence(&quot;{100000000000000000000000}&quot;);

+

+testIdentifierPartUnicodeEscapeSequence(&quot;{41}&quot;, &quot;0041&quot;);

+testIdentifierPartUnicodeEscapeSequence(&quot;{10000}&quot;, &quot;D800,DC00&quot;);

+testIdentifierPartUnicodeEscapeSequence(&quot;{10001}&quot;, &quot;D800,DC01&quot;);

+testIdentifierPartUnicodeEscapeSequence(&quot;{102C0}&quot;, &quot;D800,DEC0&quot;);

+testIdentifierPartUnicodeEscapeSequence(&quot;{102c0}&quot;, &quot;D800,DEC0&quot;);

+

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{0}&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{D800}&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{d800}&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{DC00}&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{dc00}&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{FFFF}&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{ffff}&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{1D306}&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{1d306}&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{10FFFE}&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{10fffe}&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{10FFFF}&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{10ffff}&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{00000000000000000000000010FFFF}&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{00000000000000000000000010ffff}&quot;);

+

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;x&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{}&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{G}&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{1G}&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{110000}&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{1000000}&quot;);

+testInvalidIdentifierPartUnicodeEscapeSequence(&quot;{100000000000000000000000}&quot;);

+

+var successfullyParsed = true;

</ins></span></pre></div>
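<p>The invalid cases in the test script above (a missing closing brace, no digits, a non-hex digit, a value above 10FFFF) together with the valid leading-zero cases suggest the rules a code point escape parser has to enforce. The following is a rough sketch of those rules only, not the parseUnicodeEscape implementation from this patch; the function name parseCodePointEscape and its null-on-failure convention are assumptions made for illustration.</p>

<pre>
```javascript
// Hypothetical sketch: parse the text that follows "\u" when it uses the
// braced form. Returns the code point, or null if the escape is malformed
// or out of range.
function parseCodePointEscape(text)
{
    if (text.charAt(0) !== "{")
        return null;

    var value = 0;
    var digitCount = 0;
    for (var i = 1; i !== text.length; ++i) {
        var character = text.charAt(i);
        if (character === "}")
            return digitCount ? value : null; // "{}" has no digits, so it is rejected.
        if (!/^[0-9A-Fa-f]$/.test(character))
            return null; // Non-hex digit such as "G".
        value = value * 16 + parseInt(character, 16);
        // Check the range after every digit so that long digit runs cannot
        // grow the value without bound; leading zeros never trip this check.
        if (value > 0x10FFFF)
            return null;
        ++digitCount;
    }
    return null; // Ran out of input before the closing brace.
}
```
</pre>

<p>Because the digit count is not limited, the range check is what rejects inputs such as {110000} while still accepting {00000000000000000000000010FFFF}.</p>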

<a id="trunkLayoutTestsjsunicodeescapesequencesexpectedtxt"></a>

<div class="addfile"><h4>Added: trunk/LayoutTests/js/unicode-escape-sequences-expected.txt (0 => 183552)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/LayoutTests/js/unicode-escape-sequences-expected.txt                                (rev 0)

+++ trunk/LayoutTests/js/unicode-escape-sequences-expected.txt        2015-04-29 16:33:12 UTC (rev 183552)

</span><span class="lines">@@ -0,0 +1,96 @@

</span><ins>+Test of Unicode escape sequences in string literals and identifiers, especially code point escape sequences.

+

+On success, you will see a series of &quot;PASS&quot; messages, followed by &quot;TEST COMPLETE&quot;.

+

+

+PASS codeUnits(&quot;\u&quot;) is &quot;0075&quot;

+PASS codeUnits(&quot;\u{0}&quot;) is &quot;0000&quot;

+PASS codeUnits(&quot;\u{41}&quot;) is &quot;0041&quot;

+PASS codeUnits(&quot;\u{D800}&quot;) is &quot;D800&quot;

+PASS codeUnits(&quot;\u{d800}&quot;) is &quot;D800&quot;

+PASS codeUnits(&quot;\u{DC00}&quot;) is &quot;DC00&quot;

+PASS codeUnits(&quot;\u{dc00}&quot;) is &quot;DC00&quot;

+PASS codeUnits(&quot;\u{FFFF}&quot;) is &quot;FFFF&quot;

+PASS codeUnits(&quot;\u{ffff}&quot;) is &quot;FFFF&quot;

+PASS codeUnits(&quot;\u{10000}&quot;) is &quot;D800,DC00&quot;

+PASS codeUnits(&quot;\u{10001}&quot;) is &quot;D800,DC01&quot;

+PASS codeUnits(&quot;\u{102C0}&quot;) is &quot;D800,DEC0&quot;

+PASS codeUnits(&quot;\u{102c0}&quot;) is &quot;D800,DEC0&quot;

+PASS codeUnits(&quot;\u{1D306}&quot;) is &quot;D834,DF06&quot;

+PASS codeUnits(&quot;\u{1d306}&quot;) is &quot;D834,DF06&quot;

+PASS codeUnits(&quot;\u{10FFFE}&quot;) is &quot;DBFF,DFFE&quot;

+PASS codeUnits(&quot;\u{10fffe}&quot;) is &quot;DBFF,DFFE&quot;

+PASS codeUnits(&quot;\u{10FFFF}&quot;) is &quot;DBFF,DFFF&quot;

+PASS codeUnits(&quot;\u{10ffff}&quot;) is &quot;DBFF,DFFF&quot;

+PASS codeUnits(&quot;\u{00000000000000000000000010FFFF}&quot;) is &quot;DBFF,DFFF&quot;

+PASS codeUnits(&quot;\u{00000000000000000000000010ffff}&quot;) is &quot;DBFF,DFFF&quot;

+PASS codeUnits(&quot;\ux&quot;) threw exception SyntaxError: \u can only be followed by a Unicode character sequence.

+PASS codeUnits(&quot;\u{&quot;) threw exception SyntaxError: \u can only be followed by a Unicode character sequence.

+PASS codeUnits(&quot;\u{}&quot;) threw exception SyntaxError: \u can only be followed by a Unicode character sequence.

+PASS codeUnits(&quot;\u{G}&quot;) threw exception SyntaxError: \u can only be followed by a Unicode character sequence.

+PASS codeUnits(&quot;\u{1G}&quot;) threw exception SyntaxError: \u can only be followed by a Unicode character sequence.

+PASS codeUnits(&quot;\u{110000}&quot;) threw exception SyntaxError: \u can only be followed by a Unicode character sequence.

+PASS codeUnits(&quot;\u{1000000}&quot;) threw exception SyntaxError: \u can only be followed by a Unicode character sequence.

+PASS codeUnits(&quot;\u{100000000000000000000000}&quot;) threw exception SyntaxError: \u can only be followed by a Unicode character sequence.

+PASS codeUnits(function \u{41}(){}.name) is &quot;0041&quot;

+PASS codeUnits(function \u{102C0}(){}.name) is &quot;D800,DEC0&quot;

+PASS codeUnits(function \u{102c0}(){}.name) is &quot;D800,DEC0&quot;

+PASS codeUnits(function \u{1D306}(){}.name) is &quot;D834,DF06&quot;

+PASS codeUnits(function \u{1d306}(){}.name) is &quot;D834,DF06&quot;

+PASS codeUnits(function \u(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u'.

+PASS codeUnits(function \u{0}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{0}'.

+PASS codeUnits(function \u{D800}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{D800}'.

+PASS codeUnits(function \u{d800}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{d800}'.

+PASS codeUnits(function \u{DC00}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{DC00}'.

+PASS codeUnits(function \u{dc00}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{dc00}'.

+PASS codeUnits(function \u{FFFF}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{FFFF}'.

+PASS codeUnits(function \u{ffff}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{ffff}'.

+PASS codeUnits(function \u{10000}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{10000}'.

+PASS codeUnits(function \u{10001}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{10001}'.

+PASS codeUnits(function \u{10FFFE}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{10FFFE}'.

+PASS codeUnits(function \u{10fffe}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{10fffe}'.

+PASS codeUnits(function \u{10FFFF}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{10FFFF}'.

+PASS codeUnits(function \u{10ffff}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{10ffff}'.

+PASS codeUnits(function \u{00000000000000000000000010FFFF}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{00000000000000000000000010FFFF}'.

+PASS codeUnits(function \u{00000000000000000000000010ffff}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{00000000000000000000000010ffff}'.

+PASS codeUnits(function \ux(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u'.

+PASS codeUnits(function \u{(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{'.

+PASS codeUnits(function \u{}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{'.

+PASS codeUnits(function \u{G}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{'.

+PASS codeUnits(function \u{1G}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{1'.

+PASS codeUnits(function \u{110000}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{11000'.

+PASS codeUnits(function \u{1000000}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{100000'.

+PASS codeUnits(function \u{100000000000000000000000}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{100000'.

+PASS codeUnits(function x\u{41}(){}.name.substring(1)) is &quot;0041&quot;

+PASS codeUnits(function x\u{10000}(){}.name.substring(1)) is &quot;D800,DC00&quot;

+PASS codeUnits(function x\u{10001}(){}.name.substring(1)) is &quot;D800,DC01&quot;

+PASS codeUnits(function x\u{102C0}(){}.name.substring(1)) is &quot;D800,DEC0&quot;

+PASS codeUnits(function x\u{102c0}(){}.name.substring(1)) is &quot;D800,DEC0&quot;

+PASS codeUnits(function x\u(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u'.

+PASS codeUnits(function x\u{0}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{0}'.

+PASS codeUnits(function x\u{D800}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{D800}'.

+PASS codeUnits(function x\u{d800}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{d800}'.

+PASS codeUnits(function x\u{DC00}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{DC00}'.

+PASS codeUnits(function x\u{dc00}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{dc00}'.

+PASS codeUnits(function x\u{FFFF}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{FFFF}'.

+PASS codeUnits(function x\u{ffff}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{ffff}'.

+PASS codeUnits(function x\u{1D306}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{1D306}'.

+PASS codeUnits(function x\u{1d306}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{1d306}'.

+PASS codeUnits(function x\u{10FFFE}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{10FFFE}'.

+PASS codeUnits(function x\u{10fffe}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{10fffe}'.

+PASS codeUnits(function x\u{10FFFF}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{10FFFF}'.

+PASS codeUnits(function x\u{10ffff}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{10ffff}'.

+PASS codeUnits(function x\u{00000000000000000000000010FFFF}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{00000000000000000000000010FFFF}'.

+PASS codeUnits(function x\u{00000000000000000000000010ffff}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{00000000000000000000000010ffff}'.

+PASS codeUnits(function x\ux(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u'.

+PASS codeUnits(function x\u{(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{'.

+PASS codeUnits(function x\u{}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{'.

+PASS codeUnits(function x\u{G}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{'.

+PASS codeUnits(function x\u{1G}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{1'.

+PASS codeUnits(function x\u{110000}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{11000'.

+PASS codeUnits(function x\u{1000000}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{100000'.

+PASS codeUnits(function x\u{100000000000000000000000}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{100000'.

+PASS successfullyParsed is true

+

+TEST COMPLETE

+

</ins><span class="cx">Property changes on: trunk/LayoutTests/js/unicode-escape-sequences-expected.txt

</span><span class="cx">___________________________________________________________________

</span></span></pre></div>

<a id="svneolstyle"></a>

<div class="addfile"><h4>Added: svn:eol-style</h4></div>

<a id="trunkLayoutTestsjsunicodeescapesequenceshtml"></a>

<div class="addfile"><h4>Added: trunk/LayoutTests/js/unicode-escape-sequences.html (0 => 183552)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/LayoutTests/js/unicode-escape-sequences.html                                (rev 0)

+++ trunk/LayoutTests/js/unicode-escape-sequences.html        2015-04-29 16:33:12 UTC (rev 183552)

</span><span class="lines">@@ -0,0 +1,8 @@

</span><ins>+&lt;!DOCTYPE html&gt;

+&lt;html&gt;

+&lt;body&gt;

+&lt;script src=&quot;../resources/js-test-pre.js&quot;&gt;&lt;/script&gt;

+&lt;script src=&quot;script-tests/unicode-escape-sequences.js&quot;&gt;&lt;/script&gt;

+&lt;script src=&quot;../resources/js-test-post.js&quot;&gt;&lt;/script&gt;

+&lt;/body&gt;

+&lt;/html&gt;

</ins><span class="cx">Property changes on: trunk/LayoutTests/js/unicode-escape-sequences.html

</span><span class="cx">___________________________________________________________________

</span></span></pre></div>

<a id="svnmimetype"></a>

<div class="addfile"><h4>Added: svn:mime-type</h4></div>

<a id="svneolstyle"></a>

<div class="addfile"><h4>Added: svn:eol-style</h4></div>

<a id="trunkSourceJavaScriptCoreChangeLog"></a>

<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/ChangeLog (183551 => 183552)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Source/JavaScriptCore/ChangeLog        2015-04-29 16:32:05 UTC (rev 183551)

+++ trunk/Source/JavaScriptCore/ChangeLog        2015-04-29 16:33:12 UTC (rev 183552)

</span><span class="lines">@@ -1,3 +1,45 @@

</span><ins>+2015-04-29  Darin Adler  &lt;darin@apple.com&gt;

+

+        [ES6] Implement Unicode code point escapes

+        https://bugs.webkit.org/show_bug.cgi?id=144377

+

+        Reviewed by Antti Koivisto.

+

+        * parser/Lexer.cpp: Moved the UnicodeHexValue class in here from

+        the header. Made it a non-member class so it doesn't need to be part

+        of a template. Made it use UChar32 instead of int for the value to

+        make it clearer what goes into this class.

+        (JSC::ParsedUnicodeEscapeValue::isIncomplete): Added. Replaces the

+        old type() function.

+        (JSC::Lexer&lt;CharacterType&gt;::parseUnicodeEscape): Renamed from

+        parseFourDigitUnicodeHex and added support for code point escapes.

+        (JSC::isLatin1): Added an overload for UChar32.

+        (JSC::isIdentStart): Changed this to take UChar32; no caller tries

+        to call it with a UChar, so no need to overload for that type for now.

+        (JSC::isNonLatin1IdentPart): Changed argument type to UChar32 for clarity.

+        Also added FIXME about a subtle ES6 change that we might want to make later.

+        (JSC::isIdentPart): Changed this to take UChar32; no caller tries

+        to call it with a UChar, so no need to overload for that type for now.

+        (JSC::isIdentPartIncludingEscapeTemplate): Made this a template so that we

+        don't need to repeat the code twice. Added code to handle code point escapes.

+        (JSC::isIdentPartIncludingEscape): Call the template instead of having the

+        code in line.

+        (JSC::Lexer&lt;CharacterType&gt;::recordUnicodeCodePoint): Added.

+        (JSC::Lexer&lt;CharacterType&gt;::parseIdentifierSlowCase): Made small tweaks and

+        updated to call parseUnicodeEscape instead of parseFourDigitUnicodeHex.

+        (JSC::Lexer&lt;CharacterType&gt;::parseComplexEscape): Call parseUnicodeEscape

+        instead of parseFourDigitUnicodeHex. Move the code to handle &quot;\u&quot; before

+        the code that handles the escapes, since the code point escape code now

+        consumes characters while parsing rather than peeking ahead. Test case

+        covers this: Symptom would be that &quot;\u{&quot; would evaluate to &quot;u&quot; instead of

+        giving a syntax error.

+

+        * parser/Lexer.h: Updated for above changes.

+

+        * runtime/StringConstructor.cpp:

+        (JSC::stringFromCodePoint): Use ICU's UCHAR_MAX_VALUE instead of writing

+        out 0x10FFFF; clearer this way.

+

</ins><span class="cx"> 2015-04-29  Martin Robinson  &lt;mrobinson@igalia.com&gt;

</span><span class="cx"> 

</span><span class="cx">         [CMake] [GTK] Organize and clean up unused CMake variables

</span></span></pre></div>

<a id="trunkSourceJavaScriptCoreparserLexercpp"></a>

<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/parser/Lexer.cpp (183551 => 183552)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Source/JavaScriptCore/parser/Lexer.cpp        2015-04-29 16:32:05 UTC (rev 183551)

+++ trunk/Source/JavaScriptCore/parser/Lexer.cpp        2015-04-29 16:33:12 UTC (rev 183552)

</span><span class="lines">@@ -610,22 +610,60 @@

</span><span class="cx">     return (code &lt; m_codeEnd) ? *code : 0;

</span><span class="cx"> }

</span><span class="cx"> 

</span><del>-template &lt;typename T&gt;

-typename Lexer&lt;T&gt;::UnicodeHexValue Lexer&lt;T&gt;::parseFourDigitUnicodeHex()

</del><ins>+struct ParsedUnicodeEscapeValue {

+    ParsedUnicodeEscapeValue(UChar32 value)

+        : m_value(value)

+    {

+        ASSERT(isValid());

+    }

+

+    enum SpecialValueType { Incomplete = -2, Invalid = -1 };

+    ParsedUnicodeEscapeValue(SpecialValueType type)

+        : m_value(type)

+    {

+    }

+

+    bool isValid() const { return m_value &gt;= 0; }

+    bool isIncomplete() const { return m_value == Incomplete; }

+

+    UChar32 value() const

+    {

+        ASSERT(isValid());

+        return m_value;

+    }

+

+private:

+    UChar32 m_value;

+};

+

+template&lt;typename CharacterType&gt; ParsedUnicodeEscapeValue Lexer&lt;CharacterType&gt;::parseUnicodeEscape()

</ins><span class="cx"> {

</span><del>-    T char1 = peek(1);

-    T char2 = peek(2);

-    T char3 = peek(3);

</del><ins>+    if (m_current == '{') {

+        shift();

+        UChar32 codePoint = 0;

+        do {

+            if (!isASCIIHexDigit(m_current))

+                return m_current ? ParsedUnicodeEscapeValue::Invalid : ParsedUnicodeEscapeValue::Incomplete;

+            codePoint = (codePoint &lt;&lt; 4) | toASCIIHexValue(m_current);

+            if (codePoint &gt; UCHAR_MAX_VALUE)

+                return ParsedUnicodeEscapeValue::Invalid;

+            shift();

+        } while (m_current != '}');

+        shift();

+        return codePoint;

+    }

</ins><span class="cx"> 

</span><del>-    if (UNLIKELY(!isASCIIHexDigit(m_current) || !isASCIIHexDigit(char1) || !isASCIIHexDigit(char2) || !isASCIIHexDigit(char3)))

-        return UnicodeHexValue((m_code + 4) &gt;= m_codeEnd ? UnicodeHexValue::IncompleteHex : UnicodeHexValue::InvalidHex);

-

-    int result = convertUnicode(m_current, char1, char2, char3);

</del><ins>+    auto character2 = peek(1);

+    auto character3 = peek(2);

+    auto character4 = peek(3);

+    if (UNLIKELY(!isASCIIHexDigit(m_current) || !isASCIIHexDigit(character2) || !isASCIIHexDigit(character3) || !isASCIIHexDigit(character4)))

+        return (m_code + 4) &gt;= m_codeEnd ? ParsedUnicodeEscapeValue::Incomplete : ParsedUnicodeEscapeValue::Invalid;

+    auto result = convertUnicode(m_current, character2, character3, character4);

</ins><span class="cx">     shift();

</span><span class="cx">     shift();

</span><span class="cx">     shift();

</span><span class="cx">     shift();

</span><del>-    return UnicodeHexValue(result);

</del><ins>+    return result;

</ins><span class="cx"> }

</span><span class="cx"> 

</span><span class="cx"> template &lt;typename T&gt;

</span><span class="lines">@@ -665,18 +703,24 @@

</span><span class="cx">     return c &lt; 256;

</span><span class="cx"> }

</span><span class="cx"> 

</span><ins>+static ALWAYS_INLINE bool isLatin1(UChar32 c)

+{

+    return !(c &amp; ~0xFF);

+}

+

</ins><span class="cx"> static inline bool isIdentStart(LChar c)

</span><span class="cx"> {

</span><span class="cx">     return typesOfLatin1Characters[c] == CharacterIdentifierStart;

</span><span class="cx"> }

</span><span class="cx"> 

</span><del>-static inline bool isIdentStart(UChar c)

</del><ins>+static inline bool isIdentStart(UChar32 c)

</ins><span class="cx"> {

</span><span class="cx">     return isLatin1(c) ? isIdentStart(static_cast&lt;LChar&gt;(c)) : isNonLatin1IdentStart(c);

</span><span class="cx"> }

</span><span class="cx"> 

</span><del>-static NEVER_INLINE bool isNonLatin1IdentPart(int c)

</del><ins>+static NEVER_INLINE bool isNonLatin1IdentPart(UChar32 c)

</ins><span class="cx"> {

<ins>+    // FIXME: ES6 says this should be based on the Unicode property ID_Continue now instead.

</ins><span class="cx">     return (U_GET_GC_MASK(c) &amp; (U_GC_L_MASK | U_GC_MN_MASK | U_GC_MC_MASK | U_GC_ND_MASK | U_GC_PC_MASK)) || c == 0x200C || c == 0x200D;

</span><span class="cx"> }

</span><span class="cx"> 

</span><span class="lines">@@ -688,39 +732,59 @@

</span><span class="cx">     return typesOfLatin1Characters[c] &lt;= CharacterNumber;

</span><span class="cx"> }

</span><span class="cx"> 

</span><del>-static ALWAYS_INLINE bool isIdentPart(UChar c)

</del><ins>+static ALWAYS_INLINE bool isIdentPart(UChar32 c)

</ins><span class="cx"> {

</span><span class="cx">     return isLatin1(c) ? isIdentPart(static_cast&lt;LChar&gt;(c)) : isNonLatin1IdentPart(c);

</span><span class="cx"> }

</span><span class="cx"> 

</span><del>-template &lt;typename T&gt;

-bool isUnicodeEscapeIdentPart(const T* code)

</del><ins>+static ALWAYS_INLINE bool isIdentPart(UChar c)

</ins><span class="cx"> {

</span><del>-    T char1 = code[0];

-    T char2 = code[1];

-    T char3 = code[2];

-    T char4 = code[3];

-    

-    if (!isASCIIHexDigit(char1) || !isASCIIHexDigit(char2) || !isASCIIHexDigit(char3) || !isASCIIHexDigit(char4))

-        return false;

-    

-    return isIdentPart(Lexer&lt;T&gt;::convertUnicode(char1, char2, char3, char4));

</del><ins>+    return isIdentPart(static_cast&lt;UChar32&gt;(c));

</ins><span class="cx"> }

</span><span class="cx"> 

</span><del>-static ALWAYS_INLINE bool isIdentPartIncludingEscape(const LChar* code, const LChar* codeEnd)

</del><ins>+template&lt;typename CharacterType&gt; ALWAYS_INLINE bool isIdentPartIncludingEscapeTemplate(const CharacterType* code, const CharacterType* codeEnd)

</ins><span class="cx"> {

</span><del>-    if (isIdentPart(*code))

</del><ins>+    if (isIdentPart(code[0]))

</ins><span class="cx">         return true;

</span><span class="cx"> 

</span><del>-    return (*code == '\\' &amp;&amp; ((codeEnd - code) &gt;= 6) &amp;&amp; code[1] == 'u' &amp;&amp; isUnicodeEscapeIdentPart(code+2));

</del><ins>+    // Shortest sequence handled below is \u{0}, which is 5 characters.

+    if (!(code[0] == '\\' &amp;&amp; codeEnd - code &gt;= 5 &amp;&amp; code[1] == 'u'))

+        return false;

+

+    if (code[2] == '{') {

+        UChar32 codePoint = 0;

+        const CharacterType* pointer;

+        for (pointer = &amp;code[3]; pointer &lt; codeEnd; ++pointer) {

+            auto digit = *pointer;

+            if (!isASCIIHexDigit(digit))

+                break;

+            codePoint = (codePoint &lt;&lt; 4) | toASCIIHexValue(digit);

+            if (codePoint &gt; UCHAR_MAX_VALUE)

+                return false;

+        }

+        return isIdentPart(codePoint) &amp;&amp; pointer &lt; codeEnd &amp;&amp; *pointer == '}';

+    }

+

+    // Shortest sequence handled below is \uXXXX, which is 6 characters.

+    if (codeEnd - code &lt; 6)

+        return false;

+

+    auto character1 = code[2];

+    auto character2 = code[3];

+    auto character3 = code[4];

+    auto character4 = code[5];

+    return isASCIIHexDigit(character1) &amp;&amp; isASCIIHexDigit(character2) &amp;&amp; isASCIIHexDigit(character3) &amp;&amp; isASCIIHexDigit(character4)

+        &amp;&amp; isIdentPart(Lexer&lt;LChar&gt;::convertUnicode(character1, character2, character3, character4));

</ins><span class="cx"> }

</span><span class="cx"> 

</span><ins>+static ALWAYS_INLINE bool isIdentPartIncludingEscape(const LChar* code, const LChar* codeEnd)

+{

+    return isIdentPartIncludingEscapeTemplate(code, codeEnd);

+}

+

</ins><span class="cx"> static ALWAYS_INLINE bool isIdentPartIncludingEscape(const UChar* code, const UChar* codeEnd)

</span><span class="cx"> {

</span><del>-    if (isIdentPart(*code))

-        return true;

-    

-    return (*code == '\\' &amp;&amp; ((codeEnd - code) &gt;= 6) &amp;&amp; code[1] == 'u' &amp;&amp; isUnicodeEscapeIdentPart(code+2));

</del><ins>+    return isIdentPartIncludingEscapeTemplate(code, codeEnd);

</ins><span class="cx"> }

</span><span class="cx"> 

</span><span class="cx"> static inline LChar singleEscape(int c)

</span><span class="lines">@@ -799,6 +863,18 @@

</span><span class="cx">     m_buffer16.append(static_cast&lt;UChar&gt;(c));

</span><span class="cx"> }

</span><span class="cx">     

</span><ins>+template&lt;typename CharacterType&gt; inline void Lexer&lt;CharacterType&gt;::recordUnicodeCodePoint(UChar32 codePoint)

+{

+    ASSERT(codePoint &gt;= 0);

+    ASSERT(codePoint &lt;= UCHAR_MAX_VALUE);

+    if (U_IS_BMP(codePoint))

+        record16(codePoint);

+    else {

+        UChar codeUnits[2] = { U16_LEAD(codePoint), U16_TRAIL(codePoint) };

+        append16(codeUnits, 2);

+    }

+}

+

</ins><span class="cx"> #if !ASSERT_DISABLED

</span><span class="cx"> bool isSafeBuiltinIdentifier(VM&amp; vm, const Identifier* ident)

</span><span class="cx"> {

</span><span class="lines">@@ -807,6 +883,7 @@

</span><span class="cx">     /* Just block any use of suspicious identifiers.  This is intended to

<span class="cx">      * be used as a safety net while implementing builtins.

</span><span class="cx">      */

<ins>+    // FIXME: How can a debug-only assertion be a safety net?

</ins><span class="cx">     if (*ident == vm.propertyNames-&gt;builtinNames().callPublicName())

</span><span class="cx">         return false;

</span><span class="cx">     if (*ident == vm.propertyNames-&gt;builtinNames().applyPublicName())

</span><span class="lines">@@ -960,11 +1037,10 @@

</span><span class="cx">     return IDENT;

</span><span class="cx"> }

</span><span class="cx"> 

</span><del>-template &lt;typename T&gt;

-template &lt;bool shouldCreateIdentifier&gt; JSTokenType Lexer&lt;T&gt;::parseIdentifierSlowCase(JSTokenData* tokenData, unsigned lexerFlags, bool strictMode)

</del><ins>+template&lt;typename CharacterType&gt; template&lt;bool shouldCreateIdentifier&gt; JSTokenType Lexer&lt;CharacterType&gt;::parseIdentifierSlowCase(JSTokenData* tokenData, unsigned lexerFlags, bool strictMode)

</ins><span class="cx"> {

</span><span class="cx">     const ptrdiff_t remaining = m_codeEnd - m_code;

</span><del>-    const T* identifierStart = currentSourcePtr();

</del><ins>+    auto identifierStart = currentSourcePtr();

</ins><span class="cx">     bool bufferRequired = false;

</span><span class="cx"> 

</span><span class="cx">     while (true) {

</span><span class="lines">@@ -983,19 +1059,18 @@

</span><span class="cx">         if (UNLIKELY(m_current != 'u'))

</span><span class="cx">             return atEnd() ? UNTERMINATED_IDENTIFIER_ESCAPE_ERRORTOK : INVALID_IDENTIFIER_ESCAPE_ERRORTOK;

</span><span class="cx">         shift();

</span><del>-        UnicodeHexValue character = parseFourDigitUnicodeHex();

</del><ins>+        auto character = parseUnicodeEscape();

</ins><span class="cx">         if (UNLIKELY(!character.isValid()))

</span><del>-            return character.valueType() == UnicodeHexValue::IncompleteHex ? UNTERMINATED_IDENTIFIER_UNICODE_ESCAPE_ERRORTOK : INVALID_IDENTIFIER_UNICODE_ESCAPE_ERRORTOK;

-        UChar ucharacter = static_cast&lt;UChar&gt;(character.value());

-        if (UNLIKELY(m_buffer16.size() ? !isIdentPart(ucharacter) : !isIdentStart(ucharacter)))

</del><ins>+            return character.isIncomplete() ? UNTERMINATED_IDENTIFIER_UNICODE_ESCAPE_ERRORTOK : INVALID_IDENTIFIER_UNICODE_ESCAPE_ERRORTOK;

+        if (UNLIKELY(m_buffer16.size() ? !isIdentPart(character.value()) : !isIdentStart(character.value())))

</ins><span class="cx">             return INVALID_IDENTIFIER_UNICODE_ESCAPE_ERRORTOK;

</span><span class="cx">         if (shouldCreateIdentifier)

</span><del>-            record16(ucharacter);

</del><ins>+            recordUnicodeCodePoint(character.value());

</ins><span class="cx">         identifierStart = currentSourcePtr();

</span><span class="cx">     }

</span><span class="cx"> 

</span><span class="cx">     int identifierLength;

</span><del>-    const Identifier* ident = 0;

</del><ins>+    const Identifier* ident = nullptr;

</ins><span class="cx">     if (shouldCreateIdentifier) {

</span><span class="cx">         if (!bufferRequired) {

</span><span class="cx">             identifierLength = currentSourcePtr() - identifierStart;

</span><span class="lines">@@ -1008,7 +1083,7 @@

</span><span class="cx"> 

</span><span class="cx">         tokenData-&gt;ident = ident;

</span><span class="cx">     } else

</span><del>-        tokenData-&gt;ident = 0;

</del><ins>+        tokenData-&gt;ident = nullptr;

</ins><span class="cx"> 

</span><span class="cx">     if (LIKELY(!bufferRequired &amp;&amp; !(lexerFlags &amp; LexerFlagsIgnoreReservedWords))) {

</span><span class="cx">         ASSERT(shouldCreateIdentifier);

</span><span class="lines">@@ -1125,21 +1200,22 @@

</span><span class="cx"> 

</span><span class="cx">     if (m_current == 'u') {

</span><span class="cx">         shift();

</span><del>-        UnicodeHexValue character = parseFourDigitUnicodeHex();

-        if (character.isValid()) {

</del><ins>+

+        if (escapeParseMode == EscapeParseMode::String &amp;&amp; m_current == stringQuoteCharacter) {

</ins><span class="cx">             if (shouldBuildStrings)

</span><del>-                record16(character.value());

</del><ins>+                record16('u');

</ins><span class="cx">             return StringParsedSuccessfully;

</span><span class="cx">         }

</span><span class="cx"> 

</span><del>-        if (escapeParseMode == EscapeParseMode::String &amp;&amp; m_current == stringQuoteCharacter) {

</del><ins>+        auto character = parseUnicodeEscape();

+        if (character.isValid()) {

</ins><span class="cx">             if (shouldBuildStrings)

</span><del>-                record16('u');

</del><ins>+                recordUnicodeCodePoint(character.value());

</ins><span class="cx">             return StringParsedSuccessfully;

</span><span class="cx">         }

</span><span class="cx"> 

</span><span class="cx">         m_lexErrorMessage = ASCIILiteral(&quot;\\u can only be followed by a Unicode character sequence&quot;);

</span><del>-        return character.valueType() == UnicodeHexValue::IncompleteHex ? StringUnterminated : StringCannotBeParsed;

</del><ins>+        return character.isIncomplete() ? StringUnterminated : StringCannotBeParsed;

</ins><span class="cx">     }

</span><span class="cx"> 

</span><span class="cx">     if (strictMode) {

</span></span></pre></div>
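<p>The two escape forms handled by parseUnicodeEscape above can be sketched outside the lexer as follows. This is a minimal standalone illustration, not the WebKit code: the names parseUnicodeEscapeSketch, hexValue, kIncomplete, kInvalid and kMaxCodePoint are hypothetical, the input is a std::string holding the text that follows the backslash and the letter u, and the four-digit branch simplifies the end-of-input test used in the patch.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical result encoding: a non-negative value is a code point; the two
// negative values mirror ParsedUnicodeEscapeValue's Incomplete and Invalid.
constexpr int32_t kIncomplete = -2;
constexpr int32_t kInvalid = -1;
constexpr int32_t kMaxCodePoint = 0x10FFFF; // same value as ICU's UCHAR_MAX_VALUE

// Returns 0..15 for an ASCII hex digit, or -1 for anything else.
static int hexValue(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

// Parses the text that follows "\u", starting at `pos`. On success `pos` is
// advanced past the escape; on failure `pos` is left unchanged.
int32_t parseUnicodeEscapeSketch(const std::string& source, std::size_t& pos)
{
    std::size_t i = pos;
    if (i < source.size() && source[i] == '{') {
        ++i;
        int32_t codePoint = 0;
        do {
            if (i >= source.size())
                return kIncomplete; // ran out of input before '}'
            int digit = hexValue(source[i]);
            if (digit < 0)
                return kInvalid; // covers "{}" and stray characters
            codePoint = (codePoint << 4) | digit;
            if (codePoint > kMaxCodePoint)
                return kInvalid; // checked per digit, so the value cannot overflow
            ++i;
        } while (i >= source.size() || source[i] != '}');
        pos = i + 1; // skip the closing brace
        return codePoint;
    }

    // Classic form: exactly four hex digits.
    if (i > source.size() || source.size() - i < 4)
        return kIncomplete;
    int32_t value = 0;
    for (std::size_t k = 0; k < 4; ++k) {
        int digit = hexValue(source[i + k]);
        if (digit < 0)
            return kInvalid;
        value = (value << 4) | digit;
    }
    pos = i + 4;
    return value;
}
```

<p>Checking the range after every digit, as the patch does, keeps the accumulator small no matter how many digits the braces contain.</p>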

<a id="trunkSourceJavaScriptCoreparserLexerh"></a>

<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/parser/Lexer.h (183551 => 183552)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Source/JavaScriptCore/parser/Lexer.h        2015-04-29 16:32:05 UTC (rev 183551)

+++ trunk/Source/JavaScriptCore/parser/Lexer.h        2015-04-29 16:33:12 UTC (rev 183552)

</span><span class="lines">@@ -65,6 +65,8 @@

</span><span class="cx">     LexexFlagsDontBuildKeywords = 4

</span><span class="cx"> };

</span><span class="cx"> 

</span><ins>+struct ParsedUnicodeEscapeValue;

+

</ins><span class="cx"> template &lt;typename T&gt;

</span><span class="cx"> class Lexer {

</span><span class="cx">     WTF_MAKE_NONCOPYABLE(Lexer);

</span><span class="lines">@@ -138,42 +140,15 @@

</span><span class="cx">     void append8(const T*, size_t);

</span><span class="cx">     void record16(int);

</span><span class="cx">     void record16(T);

</span><ins>+    void recordUnicodeCodePoint(UChar32);

</ins><span class="cx">     void append16(const LChar*, size_t);

</span><span class="cx">     void append16(const UChar* characters, size_t length) { m_buffer16.append(characters, length); }

</span><span class="cx"> 

</span><span class="cx">     ALWAYS_INLINE void shift();

</span><span class="cx">     ALWAYS_INLINE bool atEnd() const;

</span><span class="cx">     ALWAYS_INLINE T peek(int offset) const;

</span><del>-    struct UnicodeHexValue {

-        

-        enum ValueType { ValidHex, IncompleteHex, InvalidHex };

-        

-        explicit UnicodeHexValue(int value)

-            : m_value(value)

-        {

-        }

-        explicit UnicodeHexValue(ValueType type)

-            : m_value(type == IncompleteHex ? -2 : -1)

-        {

-        }

</del><span class="cx"> 

</span><del>-        ValueType valueType() const

-        {

-            if (m_value &gt;= 0)

-                return ValidHex;

-            return m_value == -2 ? IncompleteHex : InvalidHex;

-        }

-        bool isValid() const { return m_value &gt;= 0; }

-        int value() const

-        {

-            ASSERT(m_value &gt;= 0);

-            return m_value;

-        }

-        

-    private:

-        int m_value;

-    };

-    UnicodeHexValue parseFourDigitUnicodeHex();

</del><ins>+    ParsedUnicodeEscapeValue parseUnicodeEscape();

</ins><span class="cx">     void shiftLineTerminator();

</span><span class="cx"> 

</span><span class="cx">     ALWAYS_INLINE int offsetFromSourcePtr(const T* ptr) const { return ptr - m_codeStart; }

</span></span></pre></div>
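<p>The recordUnicodeCodePoint declaration added above stores a code point as one or two UTF-16 code units. The sketch below shows that encoding step with the standard library only; appendCodePointSketch is a hypothetical name, std::u16string stands in for the lexer's 16-bit buffer, and the arithmetic is written out by hand rather than through ICU's U16_LEAD and U16_TRAIL macros.</p>

```cpp
#include <cassert>
#include <string>

// Appends `codePoint` to a UTF-16 buffer. Assumes codePoint <= 0x10FFFF,
// matching the assertions in the patch.
void appendCodePointSketch(std::u16string& buffer, char32_t codePoint)
{
    assert(codePoint <= 0x10FFFF);
    if (codePoint <= 0xFFFF) {
        // Basic Multilingual Plane: a single code unit.
        buffer.push_back(static_cast<char16_t>(codePoint));
        return;
    }
    // Supplementary plane: split the 20-bit offset into a surrogate pair.
    char32_t offset = codePoint - 0x10000;
    buffer.push_back(static_cast<char16_t>(0xD800 + (offset >> 10)));   // lead surrogate
    buffer.push_back(static_cast<char16_t>(0xDC00 + (offset & 0x3FF))); // trail surrogate
}
```

<p>For example, U+1F600 is stored as the pair 0xD83D 0xDE00, while U+0041 stays a single unit.</p>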

<a id="trunkSourceJavaScriptCoreruntimeStringConstructorcpp"></a>

<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/runtime/StringConstructor.cpp (183551 => 183552)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Source/JavaScriptCore/runtime/StringConstructor.cpp        2015-04-29 16:32:05 UTC (rev 183551)

+++ trunk/Source/JavaScriptCore/runtime/StringConstructor.cpp        2015-04-29 16:33:12 UTC (rev 183552)

</span><span class="lines">@@ -105,7 +105,7 @@

</span><span class="cx"> 

</span><span class="cx">         uint32_t codePoint = static_cast&lt;uint32_t&gt;(codePointAsDouble);

</span><span class="cx"> 

</span><del>-        if (codePoint != codePointAsDouble || codePoint &gt; 0x10FFFF)

</del><ins>+        if (codePoint != codePointAsDouble || codePoint &gt; UCHAR_MAX_VALUE)

</ins><span class="cx">             return throwVMError(exec, createRangeError(exec, ASCIILiteral(&quot;Arguments contain a value that is out of range of code points&quot;)));

</span><span class="cx"> 

</span><span class="cx">         if (U_IS_BMP(codePoint))

</span></span></pre></div>
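<p>The range test in stringFromCodePoint accepts a number only if it is an integer no greater than the highest code point. A minimal sketch of that validation follows, assuming a hypothetical helper named isValidCodePointNumber. Unlike the patched code, it checks the range before any conversion so that no out-of-range double is ever cast to an integer type.</p>

```cpp
#include <cassert>
#include <cmath>

// Returns true when `number` is an integer in [0, 0x10FFFF].
bool isValidCodePointNumber(double number)
{
    // Written as a negated conjunction so that NaN is rejected as well.
    if (!(number >= 0 && number <= 0x10FFFF))
        return false;
    // Reject fractional values such as 3.5.
    return std::trunc(number) == number;
}
```

<p>Values that fail this test correspond to the RangeError path in the diff above.</p>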

<a id="trunkSourceWebCoreChangeLog"></a>

<div class="modfile"><h4>Modified: trunk/Source/WebCore/ChangeLog (183551 => 183552)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Source/WebCore/ChangeLog        2015-04-29 16:32:05 UTC (rev 183551)

+++ trunk/Source/WebCore/ChangeLog        2015-04-29 16:33:12 UTC (rev 183552)

</span><span class="lines">@@ -1,3 +1,34 @@

</span><ins>+2015-04-29  Darin Adler  &lt;darin@apple.com&gt;

+

+        [ES6] Implement Unicode code point escapes

+        https://bugs.webkit.org/show_bug.cgi?id=144377

+

+        Reviewed by Antti Koivisto.

+

+        Test: js/unicode-escape-sequences.html

+

+        * css/CSSParser.cpp:

+        (WebCore::CSSParser::parseEscape): Use ICU's UCHAR_MAX_VALUE instead of writing

+        out 0x10FFFF; clearer this way. Also use our replacementCharacter instead of

+        writing out 0xFFFD.

+

+        * html/parser/HTMLEntityParser.cpp:

+        (WebCore::isAlphaNumeric): Deleted.

+        (WebCore::HTMLEntityParser::legalEntityFor): Use ICU's UCHAR_MAX_VALUE and

+        U_IS_SURROGATE instead of writing the code out. Didn't use U_IS_UNICODE_CHAR

+        because that also rejects noncharacters (U_IS_UNICODE_NONCHAR) and thus would change behavior,

+        but maybe it's something we want to do in the future.

+        (WebCore::HTMLEntityParser::consumeNamedEntity): Use isASCIIAlphanumeric instead

+        of the function in this file that does the same thing less efficiently.

+

+        * html/parser/InputStreamPreprocessor.h:

+        (WebCore::InputStreamPreprocessor::processNextInputCharacter): Use

+        replacementCharacter from CharacterNames.h instead of writing out 0xFFFD.

+

+        * xml/parser/CharacterReferenceParserInlines.h:

+        (WebCore::consumeCharacterReference): Use ICU's UCHAR_MAX_VALUE instead of

+        defining our own local highestValidCharacter constant.

+

</ins><span class="cx"> 2015-04-29  Martin Robinson  &lt;mrobinson@igalia.com&gt;

</span><span class="cx"> 

</span><span class="cx">         [CMake] [GTK] Organize and clean up unused CMake variables

</span></span></pre></div>

<a id="trunkSourceWebCorecssCSSParsercpp"></a>

<div class="modfile"><h4>Modified: trunk/Source/WebCore/css/CSSParser.cpp (183551 => 183552)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Source/WebCore/css/CSSParser.cpp        2015-04-29 16:32:05 UTC (rev 183551)

+++ trunk/Source/WebCore/css/CSSParser.cpp        2015-04-29 16:33:12 UTC (rev 183552)

</span><span class="lines">@@ -10762,9 +10762,8 @@

</span><span class="cx">             unicode = (unicode &lt;&lt; 4) + toASCIIHexValue(*src++);

</span><span class="cx">         } while (--length &amp;&amp; isASCIIHexDigit(*src));

</span><span class="cx"> 

<del>-        // Characters above 0x10ffff are not handled.

-        if (unicode &gt; 0x10ffff)

-            unicode = 0xfffd;

</del><ins>+        if (unicode &gt; UCHAR_MAX_VALUE)

+            unicode = replacementCharacter;

</ins><span class="cx"> 

<span class="cx">         // Optional space after the escape sequence.

</span><span class="cx">         if (isHTMLSpace(*src))

</span></span></pre></div>

<a id="trunkSourceWebCorehtmlparserHTMLEntityParsercpp"></a>

<div class="modfile"><h4>Modified: trunk/Source/WebCore/html/parser/HTMLEntityParser.cpp (183551 => 183552)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Source/WebCore/html/parser/HTMLEntityParser.cpp        2015-04-29 16:32:05 UTC (rev 183551)

+++ trunk/Source/WebCore/html/parser/HTMLEntityParser.cpp        2015-04-29 16:33:12 UTC (rev 183552)

</span><span class="lines">@@ -32,9 +32,8 @@

</span><span class="cx"> #include &quot;HTMLEntitySearch.h&quot;

</span><span class="cx"> #include &quot;HTMLEntityTable.h&quot;

</span><span class="cx"> #include &lt;wtf/text/StringBuilder.h&gt;

</span><ins>+#include &lt;wtf/unicode/CharacterNames.h&gt;

</ins><span class="cx"> 

</span><del>-using namespace WTF;

-

</del><span class="cx"> namespace WebCore {

</span><span class="cx"> 

</span><span class="cx"> static const UChar windowsLatin1ExtensionArray[32] = {

</span><span class="lines">@@ -44,17 +43,12 @@

</span><span class="cx">     0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178, // 98-9F

</span><span class="cx"> };

</span><span class="cx"> 

</span><del>-static inline bool isAlphaNumeric(UChar cc)

-{

-    return (cc &gt;= '0' &amp;&amp; cc &lt;= '9') || (cc &gt;= 'a' &amp;&amp; cc &lt;= 'z') || (cc &gt;= 'A' &amp;&amp; cc &lt;= 'Z');

-}

-

</del><span class="cx"> class HTMLEntityParser {

</span><span class="cx"> public:

</span><span class="cx">     static UChar32 legalEntityFor(UChar32 value)

</span><span class="cx">     {

</span><del>-        if (value &lt;= 0 || value &gt; 0x10FFFF || (value &gt;= 0xD800 &amp;&amp; value &lt;= 0xDFFF))

-            return 0xFFFD;

</del><ins>+        if (value &lt;= 0 || value &gt; UCHAR_MAX_VALUE || U_IS_SURROGATE(value))

+            return replacementCharacter;

</ins><span class="cx">         if ((value &amp; ~0x1F) != 0x80)

</span><span class="cx">             return value;

</span><span class="cx">         return windowsLatin1ExtensionArray[value - 0x80];

</span><span class="lines">@@ -104,7 +98,7 @@

</span><span class="cx">         }

</span><span class="cx">         if (entitySearch.mostRecentMatch()-&gt;lastCharacter() == ';'

</span><span class="cx">             || !additionalAllowedCharacter

</span><del>-            || !(isAlphaNumeric(cc) || cc == '=')) {

</del><ins>+            || !(isASCIIAlphanumeric(cc) || cc == '=')) {

</ins><span class="cx">             decodedEntity.append(entitySearch.mostRecentMatch()-&gt;firstValue);

</span><span class="cx">             if (entitySearch.mostRecentMatch()-&gt;secondValue)

</span><span class="cx">                 decodedEntity.append(entitySearch.mostRecentMatch()-&gt;secondValue);

</span></span></pre></div>
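<p>The first check in legalEntityFor maps zero, negative values, out-of-range values and surrogates to U+FFFD. A rough standalone sketch of just that step is shown here; sanitizeCodePointSketch and kReplacementCharacter are hypothetical names, the surrogate test is spelled out as the explicit range the patch removed, and the Windows Latin-1 remapping of 0x80 to 0x9F is deliberately left out because only part of that table appears in this diff.</p>

```cpp
#include <cassert>
#include <cstdint>

constexpr int32_t kReplacementCharacter = 0xFFFD;

// Returns `value` when it is a usable code point, U+FFFD otherwise.
// The 0x80..0x9F remapping done by the real function is omitted here.
int32_t sanitizeCodePointSketch(int32_t value)
{
    bool isSurrogate = value >= 0xD800 && value <= 0xDFFF;
    if (value <= 0 || value > 0x10FFFF || isSurrogate)
        return kReplacementCharacter;
    return value;
}
```

<p>Noncharacters such as U+FFFE pass through unchanged in this sketch, which matches the behavior the ChangeLog says the patch chose to preserve.</p>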

<a id="trunkSourceWebCorehtmlparserInputStreamPreprocessorh"></a>

<div class="modfile"><h4>Modified: trunk/Source/WebCore/html/parser/InputStreamPreprocessor.h (183551 => 183552)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Source/WebCore/html/parser/InputStreamPreprocessor.h        2015-04-29 16:32:05 UTC (rev 183551)

+++ trunk/Source/WebCore/html/parser/InputStreamPreprocessor.h        2015-04-29 16:33:12 UTC (rev 183552)

</span><span class="lines">@@ -30,6 +30,7 @@

</span><span class="cx"> 

</span><span class="cx"> #include &quot;SegmentedString.h&quot;

</span><span class="cx"> #include &lt;wtf/Noncopyable.h&gt;

</span><ins>+#include &lt;wtf/unicode/CharacterNames.h&gt;

</ins><span class="cx"> 

</span><span class="cx"> namespace WebCore {

</span><span class="cx"> 

</span><span class="lines">@@ -115,7 +116,7 @@

</span><span class="cx">                     m_nextInputCharacter = source.currentChar();

</span><span class="cx">                     goto ProcessAgain;

</span><span class="cx">                 }

</span><del>-                m_nextInputCharacter = 0xFFFD;

</del><ins>+                m_nextInputCharacter = replacementCharacter;

</ins><span class="cx">             }

</span><span class="cx">         }

</span><span class="cx">         return true;

</span></span></pre></div>

<a id="trunkSourceWebCorexmlparserCharacterReferenceParserInlinesh"></a>

<div class="modfile"><h4>Modified: trunk/Source/WebCore/xml/parser/CharacterReferenceParserInlines.h (183551 => 183552)</h4>

<pre class="diff"><span>

<span class="info">--- trunk/Source/WebCore/xml/parser/CharacterReferenceParserInlines.h        2015-04-29 16:32:05 UTC (rev 183551)

+++ trunk/Source/WebCore/xml/parser/CharacterReferenceParserInlines.h        2015-04-29 16:33:12 UTC (rev 183552)

</span><span class="lines">@@ -54,7 +54,6 @@

</span><span class="cx">     } state = Initial;

</span><span class="cx">     UChar32 result = 0;

</span><span class="cx">     bool overflow = false;

</span><del>-    const UChar32 highestValidCharacter = 0x10FFFF;

</del><span class="cx">     StringBuilder consumedCharacters;

</span><span class="cx">     

</span><span class="cx">     while (!source.isEmpty()) {

</span><span class="lines">@@ -107,7 +106,7 @@

</span><span class="cx">         Hex:

</span><span class="cx">             if (isASCIIHexDigit(character)) {

</span><span class="cx">                 result = result * 16 + toASCIIHexValue(character);

</span><del>-                if (result &gt; highestValidCharacter)

</del><ins>+                if (result &gt; UCHAR_MAX_VALUE)

</ins><span class="cx">                     overflow = true;

</span><span class="cx">                 break;

</span><span class="cx">             }

</span><span class="lines">@@ -126,7 +125,7 @@

</span><span class="cx">         Decimal:

</span><span class="cx">             if (isASCIIDigit(character)) {

</span><span class="cx">                 result = result * 10 + character - '0';

</span><del>-                if (result &gt; highestValidCharacter)

</del><ins>+                if (result &gt; UCHAR_MAX_VALUE)

</ins><span class="cx">                     overflow = true;

</span><span class="cx">                 break;

</span><span class="cx">             }

</span></span></pre>

</div>
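<p>The Hex and Decimal states in consumeCharacterReference accumulate digits and raise an overflow flag once the value passes the highest code point. The sketch below illustrates that pattern under some assumptions: accumulateDigitsSketch is a hypothetical name, the digits arrive as a std::string rather than from a SegmentedString, and, unlike the code in the diff, accumulation stops once the flag is set so that very long inputs cannot overflow the integer.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Accumulates `digits` in the given base (10 or 16) into `result`.
// Returns false if a character is not a digit in that base.
// `overflow` becomes true once the value exceeds 0x10FFFF.
bool accumulateDigitsSketch(const std::string& digits, int base, int32_t& result, bool& overflow)
{
    result = 0;
    overflow = false;
    for (char c : digits) {
        int digit;
        if (c >= '0' && c <= '9')
            digit = c - '0';
        else if (base == 16 && c >= 'a' && c <= 'f')
            digit = c - 'a' + 10;
        else if (base == 16 && c >= 'A' && c <= 'F')
            digit = c - 'A' + 10;
        else
            return false;
        if (overflow)
            continue; // keep validating digits, but stop accumulating
        result = result * base + digit;
        if (result > 0x10FFFF)
            overflow = true;
    }
    return true;
}
```

<p>The flag is sticky: later digits are still validated, but the accumulated value is no longer meaningful once it has been set.</p>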

</div>

</body>

</html>