[Webkit-unassigned] [Bug 26590] Support for String.trim(), String.trimLeft() and String.trimRight() methods

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Tue Jun 23 15:33:24 PDT 2009


------- Comment #13 from ioseb.dzmanashvili at gmail.com  2009-06-23 15:33 PDT -------
(In reply to comment #11)


Thank you for detailed explanations and suggestions! I've changed the code.

> I have doubts about is the isWhiteSpace function.
> There is already an isWhiteSpace function in Lexer, an isStrWhiteSpace function
> in JSGlobalObjectFunctions, and WTF::Unicode::isSeparatorSpace.
> I suspect that this new isWhiteSpace function is identical to
> WTF::Unicode::isSeparatorSpace -- could you check on that?

I examined functions you mentioned before I wrote my own version.
WTF::Unicode::isSeparatorSpace only checks if character is "space" and nothing
more, I already tried the function before.

isStrWhiteSpace from JSGlobalObjectFunctions and isWhiteSpace from Lexer
functions are not enough as far as they check very limited amount of possible
whitespaces. Whitespace list I used in my own version is recommended by Ecma
3.1 spec.

> Also, I'm not sure it's appropriate for the large isWhiteSpace function to be
> marked inline, and I suspect that the const int array ends up being re-created
> and initialized every time the function is called. None of this matters if we
> can just use isSeparatorSpace.

I changed function to be inline no more. I analyzed code more carefully and
found that there were two ranges of characters - one from 0x09 to 0x0d and
second 0x2000 to 0x200b, without these ranges there left very few whitespace
characters so I changed implementation and made it similar to the function from
JSGlobalObjectFunctions.. so array re-creation and re-initialization is not the
case anymore I think.

> > +        (JSC::):
> Please don't leave a line like this in the ChangeLog, even if the
> prepare-ChangeLog script puts it there.


> I think the logic would be slightly clearer if end pointed past the last
> character instead of at the last character. You would have to say "end - 1" in
> the place you call isWhiteSpace, but otherwise it would be easier to read the
> function.


> I suggest using unsigned instead of int for end, start, and sSize. You only
> need a signed int because of end being a pointer before the last character, I
> think. You could get rid of the type casts, then.


> I think sSize could just be named size.


> No braces around single line bodies of while. Need a space after if and before
> the open parenthesis.


> This needs a special case to return sVal when start is 0 and end is sSize. I
> think it's common to call trim and not trim anything, and in that case we don't
> want to allocate a new JSString cell.


> There is no need for the special case for a size of zero. The rest of the
> function already does the right thing for that case already, including the
> jsSubstring function which won't allocate anything for a zero-length string,
> and I don't think an empty string needs extra performance optimization.


> > +var successfullyParsed = true;
> > \ No newline at end of file
> Please add the newline here.


> I'd like to see more test cases of strings that don't need to be trimmed,
> including characters that might seem to be spaces but don't qualify, such as
> control characters. If we can make the test run fast enough it might even be
> good to find a way to test many other characters to see they are not trimmed. I
> can't think of an efficient way to do it on all 2^21 characters, though, so you
> might have to be creative.

Maybe I'm wrong, but in test cases I'm using whitespace character list
recommended by Ecma 3.1 spec... other whitespace alike characters couldn't be
treated as whitespace chars because characters are matched against predefined

Thank you for being responsive and helpful!

Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

More information about the webkit-unassigned mailing list