[Webkit-unassigned] [Bug 149056] [GTK] Spellchecker rejects word when adding a period character if there is no trailing space before the next word

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Wed Feb 24 04:42:55 PST 2016


https://bugs.webkit.org/show_bug.cgi?id=149056

--- Comment #13 from Adrien Plazas <aplazas at igalia.com> ---
(In reply to comment #12)
> Here is the ICU documentation:
> http://userguide.icu-project.org/boundaryanalysis
> 
> Note that each of the four period characters is clearly shown to be a word
> boundary character in the example "Your balance is $1234.56... I think."

Only the three trailing ones are; the one in the middle of the "word" ("$1234.56") is not.
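
To illustrate the distinction, here is a rough sketch of how the period is treated under the UAX #29 word rules, as I read them (WB6/WB7 for letters, WB11/WB12 for digits). This is a simplified approximation and not ICU's actual implementation; the function name period_is_word_boundary is made up for this example.

```python
def period_is_word_boundary(text: str, i: int) -> bool:
    """Return True if the period at text[i] separates words.

    Simplified approximation of UAX #29: a period flanked by two letters
    (WB6/WB7) or by two digits (WB11/WB12) stays inside the word.
    Hypothetical helper, not part of ICU or WebKit.
    """
    if text[i] != ".":
        raise ValueError("text[i] must be a period")
    before = text[i - 1] if i > 0 else ""
    after = text[i + 1] if i + 1 < len(text) else ""
    # letter . letter: no break, so "word.Next" is a single word
    if before.isalpha() and after.isalpha():
        return False
    # digit . digit: no break, so "1234.56" is a single word
    if before.isdigit() and after.isdigit():
        return False
    # Anything else (trailing period, period before a space, "...") breaks.
    return True
```

Under this reading, the period in "1234.56" and the one in "word.Next" are both inside a word, while the period in "word. Next" is a boundary. That matches the behaviour reported in this bug.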

Also in http://www.unicode.org/reports/tr29/#Word_Boundaries:
"The goal of matching user perceptions cannot always be met exactly because the text alone does not always contain enough information to unambiguously decide boundaries. For example, the period (U+002E FULL STOP) is used ambiguously, sometimes for end-of-sentence purposes, sometimes for abbreviations, and sometimes for numbers. In most cases, however, programmatic text boundaries can match user perceptions quite closely, although sometimes the best that can be done is not to surprise the user."

They can't always decide where to break because of this ambiguity, so they try to be conservative, which is the right thing to do from their POV: it lets their users choose what to do in the ambiguous cases.

I don't see how to fix the spell checking without splitting the words into smaller pieces. Maybe they have stricter splitting rules than the "word" ones, but otherwise I don't think it's far-fetched to split the words again ourselves.
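
A minimal sketch of what "splitting again ourselves" could look like, assuming we only re-split on periods and leave anything containing a digit alone. The name split_for_spellcheck and the digit heuristic are assumptions made for this example, not existing WebKit code.

```python
from typing import List


def split_for_spellcheck(word: str) -> List[str]:
    """Split a "word" returned by the boundary iterator into smaller pieces.

    Hypothetical helper: words containing a digit (e.g. "1234.56") are kept
    whole; everything else is split on periods and empty pieces are dropped.
    """
    if any(ch.isdigit() for ch in word):
        return [word]
    return [piece for piece in word.split(".") if piece]
```

With this, "word.Next" would be checked as "word" and "Next" while "1234.56" stays in one piece. The obvious downside is that abbreviations such as "e.g." would be split into single letters, so a real fix would need to be more careful.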
