[webkit-reviews] review denied: [Bug 41238] RegExp performance slow on Dromaeo benchmark : [Attachment 59949] The patch

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Mon Jun 28 15:20:37 PDT 2010


Darin Adler <darin at apple.com> has denied Michael Saboff <msaboff at apple.com>'s
request for review:
Bug 41238: RegExp performance slow on Dromaeo benchmark
https://bugs.webkit.org/show_bug.cgi?id=41238

Attachment 59949: The patch
https://bugs.webkit.org/attachment.cgi?id=59949&action=review

------- Additional Comments from Darin Adler <darin at apple.com>
This looks great!

> +    , m_lastMatchStr(UString::null())

This shouldn't be needed. A UString will start out null.

> +    , m_lastMatchStart(-1)
>  {
> +    m_lastOVector.resize(0);

This shouldn't be needed. The vector should start out empty.

> +    , m_lastMatchStr(UString::null())

This shouldn't be needed. A UString will start out null.

> +    , m_lastMatchStart(-1)
>  {
> +    m_lastOVector.resize(0);

This shouldn't be needed. The vector should start out empty.

> +	   m_lastMatchStr = UString::null();

I think this should just be:

    m_lastMatchStr = UString();

You should check which is more efficient, but I think my form is.

> +	   m_lastOVector.resize(0);

It's more efficient to call shrink(0) than resize(0).

> +    // Check to see if this match call is the same as the last match
invocation

We normally use sentence style for comments, with a period.

> +    if ((startOffset == m_lastMatchStart) && (s.rep() ==
m_lastMatchStr.rep())) {

We normally

> +	   if (ovector) {
> +	       (Vector<int, 32>&)*ovector = (Vector<int, 32>&)m_lastOVector;
> +	   }

I don't understand why you need these typecasts. They should not be needed.

Also, we don't use braces around single line if statement bodies. It should be:


    if (ovector)
	*ovector = m_lastOVector;

> +	   if (m_lastOVector.isEmpty())
> +	       return -1;
> +	   else
> +	       return m_lastOVector.at(0);

We normally don't do else after return, so the else should be removed.

> +	   if (ovector)
> +	       (Vector<int, 32>&)m_lastOVector = (Vector<int, 32>&)*ovector;
> +	   else
> +	       (Vector<int, 32>&)m_lastOVector = (Vector<int,
32>&)nonReturnedOvector;

Same comments about typecasts.

>	   unsigned m_numSubpatterns;
> +	   UString m_lastMatchStr;

We normally try to avoid abbreviations. So I suggest you either call this
m_lastMatchString or m_lastMatch and leave out the "str".

>      ALWAYS_INLINE bool operator==(const UString& s1, const UString& s2)
>      {
>	   unsigned size = s1.size();
> +	   
> +	   if (size != s2.size())      // Sizes not the same, we're done.
> +	       return false;
> +	       
> +	   if (s1.data() == s2.data()) // Quick check of data pointers being
the same.
> +	       return true;

Another way to do this optimization is to check if s1.rep() is == s2.rep().
That would be more efficient than the repeated calls to data() this function
has. And I think it should be relatively rare to have the two data pointers to
match if they are not the exact same rep. You could put the rep check before
the size check, in fact.

review- because I think you should do at least one of the things I suggested
above


More information about the webkit-reviews mailing list