[Webkit-unassigned] [Bug 29092] Performance slow when loading a large text html file on Symbian platform

Fri Sep 18 15:02:28 PDT 2009

https://bugs.webkit.org/show_bug.cgi?id=29092

--- Comment #12 from Chang Shu <Chang.Shu at nokia.com>  2009-09-18 15:02:28 PDT ---
(In reply to comment #11)

Thanks for the review. Just a few comments below before the next patch.

> Typo: should be "scanned".
> Please add a space after "if".

Will fix the typo and coding style in the next patch.

> +        TextBreakIterator* it = characterBreakIterator(data.characters() +
> start, end - start + 1);
> How will this behave if a non-BMP character is coming next (one that takes two
> UTF-16 values)? Depending on the platform implementation of the iterator, it
> may not work well.
> Seems that this needs at least "+ 2". I think that it would work then, but I'm also not 100% sure.

To make it safe, I can use a much larger number. All I want to avoid is a huge
length like 100000 or even larger.
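
A rough sketch of what a bounded length could look like, so the iterator never
sees a huge buffer and the window never ends in the middle of a surrogate pair.
The names boundedWindowLength and lookahead are hypothetical and not part of
the patch, and std::u16string stands in for the UChar buffer here:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not WebKit API): true for the first half of a
// UTF-16 surrogate pair.
static bool isHighSurrogate(char16_t c)
{
    return c >= 0xD800 && c <= 0xDBFF;
}

// Hypothetical helper: number of UTF-16 code units to hand to the break
// iterator, starting at `start`. The window covers [start, end) plus
// `lookahead` extra code units, clamped to the end of the text.
// Precondition: start <= end <= text.size().
std::size_t boundedWindowLength(const std::u16string& text, std::size_t start,
                                std::size_t end, std::size_t lookahead)
{
    std::size_t available = text.size() - start;
    std::size_t wanted = end - start + lookahead;
    std::size_t length = wanted < available ? wanted : available;

    // If the window would stop right after a high surrogate, take one more
    // code unit so a non-BMP character is never split.
    if (length > 0 && length < available
        && isHighSurrogate(text[start + length - 1]))
        ++length;
    return length;
}
```

With a small lookahead the length stays close to end - start, so the cost no
longer depends on the size of the whole document.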

> >     // If we have maxChars of unbreakable characters the above could lead to
> >     // an infinite loop.
> This comment needs to be updated, now that we rely on the code below in more
> cases.

Any suggestion on how to update this comment? I don't think my changes above
changed the current overall behavior. However, if we want to change the
behavior, such as searching forward for the next break instead of returning the
whole buffer, my change will cause problems. In that case, the suggestion
below (using the entire remainder) is a better solution.

> What exactly gave the performance boost for Qt, trimming from the beginning, or
> from the end? If the latter is not necessary, I suggest passing the entire
> remainder of the text to iterator.

In the Qt implementation, when text is passed to
characterBreakIterator(), a corresponding flag buffer is allocated (each text
char is associated with several flags). The text is then scanned several
times and the flags are filled in. For details, see
WebCore/platform/text/qt/TextBreakIteratorQt.cpp in webkit and
corelib/tools/qtextboundaryfinder.cpp in qt. Without the improvement, the whole
text is passed to this function every time a 64K piece of text is chopped
off. Say the text length is n*64K and the time spent in createWithLengthLimit()
when it uses the entire text is T; then the overall times are roughly:
Original code: nT
My code: T
Using remainder of the text: nT/2.
That said, the last option is clearly safe and has advantages in the future.
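
To show where these estimates come from, here is a small cost model. It
assumes the iterator setup cost is linear in the length of the text it is
given, which is a simplification. ScanCost and estimateScanCost are
hypothetical names used only for this illustration, and costs are in units
of T:

```cpp
#include <cstddef>

// Total setup cost, in units of T, for each strategy.
struct ScanCost {
    double original;   // whole text passed for every 64K chunk
    double bounded;    // only the current chunk passed
    double remainder;  // remaining text passed for every chunk
};

// n is the number of 64K chunks; scanning all n chunks once costs T = 1.
ScanCost estimateScanCost(std::size_t n)
{
    ScanCost cost = {0.0, 0.0, 0.0};
    for (std::size_t i = 0; i < n; ++i) {
        cost.original += 1.0;                              // n chunks scanned
        cost.bounded += 1.0 / n;                           // 1 chunk scanned
        cost.remainder += static_cast<double>(n - i) / n;  // n - i chunks left
    }
    return cost;
}
```

Under this model the remainder strategy sums to (n + 1)T/2, which is where the
rough nT/2 comes from for large n.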
