[Webkit-unassigned] [Bug 94202] New: [GTK] Bad utf8 data is being passed to enchant_dict_check

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Thu Aug 16 00:54:23 PDT 2012


https://bugs.webkit.org/show_bug.cgi?id=94202

           Summary: [GTK] Bad utf8 data is being passed to
                    enchant_dict_check
           Product: WebKit
           Version: 528+ (Nightly build)
          Platform: Unspecified
        OS/Version: Unspecified
            Status: NEW
          Keywords: Gtk
          Severity: Normal
          Priority: P2
         Component: WebKit Gtk
        AssignedTo: webkit-unassigned at lists.webkit.org
        ReportedBy: msanchez at igalia.com
                CC: mrobinson at webkit.org


Today I observed the following error on the bots when running certain layout tests, such as this one:

  23:47:55.622 4977 worker/22 editing/selection/move-by-word-visually-single-space-inline-element.html output stderr lines:
  23:47:55.622 4977   enchant_dict_check: assertion `g_utf8_validate(word, len, NULL)' failed
  23:47:55.622 4977   enchant_dict_check: assertion `g_utf8_validate(word, len, NULL)' failed
  23:47:55.623 4977   enchant_dict_check: assertion `g_utf8_validate(word, len, NULL)' failed
  23:47:55.623 4977   enchant_dict_check: assertion `g_utf8_validate(word, len, NULL)' failed
  [...] << repeats some more times >>

I briefly investigated the issue, and it seems the problem can be fixed easily with this change:

  --- a/Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp
  +++ b/Source/WebCore/platform/text/gtk/TextCheckerEnchant.cpp
  @@ -115,7 +115,7 @@ void TextCheckerEnchant::checkSpellingOfString(const String& string, int& misspe
               g_utf8_strncpy(word.get(), cstart, wordLength);

               for (; dictIter != m_enchantDictionaries.end(); ++dictIter) {
  -                if (enchant_dict_check(*dictIter, word.get(), wordLength)) {
  +                if (enchant_dict_check(*dictIter, word.get(), bytes)) {
                       misspellingLocation = start;
                       misspellingLength = wordLength;
                   } else {

The explanation is that the 'len' parameter of enchant_dict_check expects a number of bytes, not a number of UTF-8 characters, whereas wordLength is a character count. The call therefore fails for any word that contains multi-byte characters, as in this case:

  word: דעפ => total characters: 3 / total bytes: 6

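Thus, a call to enchant_dict_check with 3 as the length will fail: the first 3 bytes end in the middle of the second character, so they are not valid UTF-8 and the g_utf8_validate assertion is triggered.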
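
To make the character/byte mismatch concrete, here is a minimal standalone sketch in plain C++ (no GLib or Enchant). The helper names utf8CharacterCount and isCompleteUtf8 are hypothetical and only for illustration; isCompleteUtf8 is a simplified stand-in for g_utf8_validate that checks only that the first len bytes form complete sequences, and it does not reject overlong encodings or surrogates.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts UTF-8 characters by skipping continuation
// bytes (10xxxxxx). Assumes the input is well-formed UTF-8.
std::size_t utf8CharacterCount(const std::string& s)
{
    std::size_t count = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80)
            ++count;
    }
    return count;
}

// Hypothetical, simplified stand-in for g_utf8_validate(word, len, NULL):
// returns true only if the first `len` bytes consist of complete sequences
// (a lead byte followed by the expected number of continuation bytes).
bool isCompleteUtf8(const std::string& s, std::size_t len)
{
    if (len > s.size())
        return false;

    std::size_t i = 0;
    while (i < len) {
        unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t extra;
        if (lead < 0x80)
            extra = 0;
        else if ((lead & 0xE0) == 0xC0)
            extra = 1;
        else if ((lead & 0xF0) == 0xE0)
            extra = 2;
        else if ((lead & 0xF8) == 0xF0)
            extra = 3;
        else
            return false; // stray continuation byte or invalid lead byte

        // The whole sequence must fit inside the first `len` bytes.
        if (i + 1 + extra > len)
            return false;

        for (std::size_t k = 1; k <= extra; ++k) {
            if ((static_cast<unsigned char>(s[i + k]) & 0xC0) != 0x80)
                return false;
        }
        i += 1 + extra;
    }
    return true;
}
```

For the word above, written as the byte string "\xD7\x93\xD7\xA2\xD7\xA4", utf8CharacterCount gives 3 while the string's size() gives 6. Checking the first 3 bytes with isCompleteUtf8 fails because the second character is cut in half, and checking all 6 bytes succeeds, which matches the difference between passing wordLength and passing bytes.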
