[webkit-reviews] review granted: [Bug 25165] First step towards replacing PCRE and completing regular expression JIT. : [Attachment 29442] Yarr!
bugzilla-daemon at webkit.org
bugzilla-daemon at webkit.org
Mon Apr 13 17:30:23 PDT 2009
Geoffrey Garen <ggaren at apple.com> has granted 's request for review:
Bug 25165: First step towards replacing PCRE and completing regular expression
JIT.
https://bugs.webkit.org/show_bug.cgi?id=25165
Attachment 29442: Yarr!
https://bugs.webkit.org/attachment.cgi?id=29442&action=review
------- Additional Comments from Geoffrey Garen <ggaren at apple.com>
I'm ignoring commented-out code, because Gavin told me that he had a local
change to remove it.
Let's rename the NARC #define to YARR_JIT.
+#if !ENABLE(YARR)
~RegExp();
+#endif
You can reduce the #ifdef'ing if you leave in ~RegExp in all builds, and just
leave out the implementation in YARR builds.
+#if ENABLE(NARC)
+ jitCompileRegex(globalData, m_regExpJITCode, m_pattern, m_numSubpatterns,
m_constructionError, ignoreCase(), multiline());
+#else
+ UNUSED_PARAM(globalData);
+ m_regExpBytecode.set(Yarr::byteCompileRegex(m_pattern, m_numSubpatterns,
m_constructionError, ignoreCase(), multiline()));
+#endif
Kind of strange that the JIT version of the code does not use the Yarr
namespace explicitly, but the bytecode version of the code does.
+CharacterClass* newlineOnce()
+{
+ static CharacterClass* once = newlineCreate();
+ return once;
+}
DANGER WILL ROBINSON. DANGER. Static initialization like this is not
thread-safe. If we want to dynamically allocate these character classes, we
need to store them in the parser, or the JSGlobalData.
I think function names like this would be easier to read as
"newlineCharacterClass()", etc., instead of "newlineOnce()", etc.
+ // If the pattern contains illegal backreferences reset & reparse.
+ if (pattern.containsIllegalBackReference()) {
Let's be more explicit and mention that we're doing this to match a quirk in
Firefox's behavior.
+ DisjunctionContext* allocDisjunctionContext(ByteDisjunction* disjunction)
+ {
+ return new(malloc(sizeof(DisjunctionContext) +
(disjunction->m_frameSize - 1) * sizeof(uintptr_t))) DisjunctionContext();
+ }
+
+ void freeDisjunctionContext(DisjunctionContext* context)
+ {
+ free(context);
+ }
I think this code, and its client, would be cleaner if you made
DisjunctionContext refcounted, and used RefPtr to manage its lifetime
automatically.
+ BytecodePattern* compile()
+ {
+ regexBegin(m_pattern.m_body->m_callFrameSize);
+ emitDisjunction(m_pattern.m_body);
+ regexEnd();
+
+ BytecodePattern* bytecode = new BytecodePattern(bodyDisjunction,
m_pattern.m_ignoreCase, m_pattern.m_multiline, m_pattern.m_numSubpatterns,
m_allParenthesesInfo, m_pattern.m_userCharacterClasses);
+ bodyDisjunction = 0;
+ m_allParenthesesInfo.shrink(0);
+ m_pattern.m_userCharacterClasses.clear();
+ return bytecode;
+ }
RefPtr here would be good, too.
+ ~BytecodePattern()
+ {
+ delete m_body;
+ for (unsigned i = 0; i < m_allParenthesesInfo.size(); ++i)
+ delete m_allParenthesesInfo[i];
+ for (unsigned i = 0; i < m_userCharacterClasses.size(); ++i)
+ delete m_userCharacterClasses[i];
+ }
And here.
+struct RegexJITObject {
+ void* m_jitCode;
+ RefPtr<ExecutablePool> m_executablePool;
+
+ RegexJITObject()
+ : m_jitCode(0)
+ {
+ }
+};
Should we call this something more specific, like Yarr::CodeBlock? "Object" is
a pretty generic name for an object.
+// Yarr::parse():
+//
Please use "/**/" syntax for multiline comments.
+ static const unsigned s_maxPatternSize = (1 << 13);
I think you want to standardize with the use of MAX_PATTERN_SIZE elsewhere, to
avoid differences between the engines.
r=me
More information about the webkit-reviews
mailing list