[webkit-reviews] review denied: [Bug 38142] RegExp caching : [Attachment 56362] RegExp caching

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Tue May 18 20:31:31 PDT 2010


Geoffrey Garen <ggaren at apple.com> has denied Renata Hodovan
<hodovan.renata at stud.u-szeged.hu>'s request for review:
Bug 38142: RegExp caching
https://bugs.webkit.org/show_bug.cgi?id=38142

Attachment 56362: RegExp caching
https://bugs.webkit.org/attachment.cgi?id=56362&action=review

------- Additional Comments from Geoffrey Garen <ggaren at apple.com>
+    RefPtr<RegExp> cachedRegExp;
+
+    if (!flags.isNull())
+	 cachedRegExp = RegExp::create(m_globalData, patternString, flags);
+    else
+	 cachedRegExp = RegExp::create(m_globalData, patternString);

This code would be much simpler if you changed RegExp::create to always take a
flags argument, but do something efficient when it is null. That might be a
good thing to do in a future patch.


+    if (patternString.size() < maxCacheablePatternLength) {
+	 ++m_ptr;
+	 if (m_ptr >= maxCacheableEntries) {
+	     m_ptr = 0;
+	     m_isFull = true;
+	 }
+	 if (m_isFull)
+	     m_cacheMap.remove(RegExpKey(patternKeyArray[m_ptr].flagsValue,
patternKeyArray[m_ptr].pattern));
+
+	 RegExpKey key = RegExpKey(flags, patternString);
+	 m_cacheMap.set(key, cachedRegExp);
+	 patternKeyArray[m_ptr].flagsValue = key.flagsValue;
+	 patternKeyArray[m_ptr].pattern = patternString.rep();
+    }
+    return cachedRegExp;

The WebKit style for code like this is to check for the failure case and return
early. One advantage of this style is that less code ends up indented. So:

if (patternString.size() >= maxCacheablePatternLength)
    return cachedRegExp;

I wouldn't call the regexp 'cachedRegExp', nor would I call the function
'createAndCache', since the regexp might not be cached. How about "regeExp" and
"create" instead?

You define m_isFull as a data member, but only use it locally in
createAndCache. I think you can get rid of it entirely, and just rely on "if
(m_ptr >= maxCacheableEntries)" in createAndCache.

lookupOrCreate and createAndCache combine to do more hash lookups and RegExpKey
construction than necessary. Here's what I'd recommend:

1. lookupOrCreate should say "pair <iterator, bool> result =
m_cacheMap.add(RegExpKey(flags, patternString), 0)"
2. if result.second is false, there was a pre-existing entry in the map. You
can return it.
3. if result.second is true, you've added a 0 to the map, and result.first is
an iterator pointing at that zero. You can now use that iterator to add a real
value to the map.

I think that patternKeyArray is unnecessary complexity. I guess your goal is
LRU eviction. However, that's not what you'll get. When the cache is full,
you'll evict the oldest item, and whatever item happens to collide with the
item you're trying to add. There's no guarantee that evicting the oldest item
will solve your collision problem.

You could go to greater lengths to implement a strict LRU cache. But what I
would recommend is just to evict upon collision. All this would require is
changing the call to add() that I suggested into a call to set() in the case
where "m_cacheMap.size() > maxCacheableEntries".

r- because I think there are still good improvements to be made here.


More information about the webkit-reviews mailing list