[Webkit-unassigned] [Bug 25415] [GTK][ATK] Please implement support for get_text_at_offset

Thu May 14 08:14:45 PDT 2009

https://bugs.webkit.org/show_bug.cgi?id=25415

------- Comment #6 from xan.lopez at gmail.com  2009-05-14 08:14 PDT -------
(In reply to comment #5)
> Xan this is awesome! Thanks!!
> 
> > OK, all the stuff is implemented now. I don't have tests for the
> > LINE_{START,END} boundaries because I don't seem to be able to convince WebKit
> > to load strings with line breaks.
> 
> When I read this, I initially took it to mean the functionality for ATs was
> implemented (i.e. ATs could get the text at/before/after the LINE_{START,END}
> boundary) and that there were WebKit tests that were not yet in place. In
> testing it, I'm seeing ('', 0, 0) when I try to get the text of a line. No
> worries as long as you know that line support doesn't seem to be implemented.

Mmm, I thought I had, but it seems I was wrong. Looking at what Gecko does, it
seems to me that for LINE_{START,END} we want the *visual* lines, not the
logical ones, right? That is, what is returned by getText(0, -1) might be
"One two three four five six seven eight nine ten.", but if that's split into 5
lines of two words each in the browser, getTextAtOffset(0, TEXT_BOUNDARY_LINE_START)
would be "One two". Am I right?
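
To illustrate the visual-versus-logical distinction, here is a toy Python
sketch. It is not WebKit or ATK code: wrap_line_starts is a hypothetical
stand-in for the layout engine that just breaks after every two words, and
line_at_offset is an assumed helper that returns the span from the visual line
start at or before the offset up to the next line start. Whether the trailing
space belongs to the returned line is an assumption of this sketch.

```python
import re


def wrap_line_starts(text, words_per_line):
    """Hypothetical layout: begin a new visual line every `words_per_line` words."""
    starts = [m.start() for m in re.finditer(r"\S+", text)]
    return starts[::words_per_line]


def line_at_offset(text, line_starts, offset):
    """Return (string, start, end) for the visual line containing `offset`."""
    start, end = 0, len(text)
    for boundary in line_starts:
        if boundary <= offset:
            start = boundary  # latest line start at or before the offset
        else:
            end = boundary  # first line start after the offset
            break
    return text[start:end], start, end


logical = "One two three four five six seven eight nine ten."
visual_starts = wrap_line_starts(logical, 2)  # five visual lines
first_line = line_at_offset(logical, visual_starts, 0)
```

Here the logical text is a single string, while first_line covers only the
first two words, which is what a LINE_START query at offset 0 would be
expected to cover if visual lines are what matter.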

> 
> I also noticed that your implementation doesn't include the character(s)
> serving as the boundary (i.e. space and/or punctuation mark), whereas other
> apps and toolkits do include it. For instance, given the sentence "This is
> another, silly test." and using Accerciser's IPython console:
> 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Gedit and OOo Writer:
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> In [1]: text = acc.queryText()
> In [2]: text.getTextAtOffset(8, TEXT_BOUNDARY_WORD_START)
> Out[2]: ('another, ', 8, 17)
> In [3]: text.getTextAtOffset(8, TEXT_BOUNDARY_WORD_END)
> Out[3]: (' another', 7, 15)
> In [4]: text.getTextAtOffset(23, TEXT_BOUNDARY_WORD_START)
> Out[4]: ('test.', 23, 28)
> In [5]: text.getTextAtOffset(23, TEXT_BOUNDARY_WORD_END)
> Out[5]: (' test', 22, 27)
> 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Gecko:
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> In [1]: text = acc.queryText()
> In [2]: text.getTextAtOffset(8, TEXT_BOUNDARY_WORD_START)
> Out[2]: ('another, ', 8, 17)
> In [3]: text.getTextAtOffset(8, TEXT_BOUNDARY_WORD_END)
> Out[3]: (' another,', 7, 16)
> In [4]: text.getTextAtOffset(23, TEXT_BOUNDARY_WORD_START)
> Out[4]: ('test.', 23, 28)
> In [5]: text.getTextAtOffset(23, TEXT_BOUNDARY_WORD_END)
> Out[5]: (' test.', 22, 28)
> 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> WebKit:
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> In [1]: text = acc.queryText()
> In [2]: text.getTextAtOffset(8, TEXT_BOUNDARY_WORD_START)
> Out[2]: ('another', 8, 15)
> In [3]: text.getTextAtOffset(8, TEXT_BOUNDARY_WORD_END)
> Out[3]: ('another', 8, 15)
> In [4]: text.getTextAtOffset(23, TEXT_BOUNDARY_WORD_START)
> Out[4]: ('test', 23, 27)
> In [5]: text.getTextAtOffset(23, TEXT_BOUNDARY_WORD_END)
> Out[5]: ('test', 23, 27)
> 
> I'm not convinced the inclusion of the space in each word is critical (although
> given that Gedit, OOo, and Gecko all do it, perhaps it is??). That said, the
> inclusion of any punctuation that defines the boundary is important. If a user
> presses Control+Right and moves to the word 'another,' Orca needs to know that
> there's a comma attached to the word so that we display it in braille and --
> when appropriate -- present it in speech.

OK, I'll look into this.
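
For reference, the Gecko results quoted above can be reproduced with a small
Python sketch. It assumes a "word" is any run of non-whitespace characters, so
attached punctuation stays with the word; that definition and the helper name
segment_at are assumptions made for illustration, not how Gecko, ATK, or WebKit
actually segment text.

```python
import re


def segment_at(text, offset, boundaries):
    """Return (string, start, end) between the boundaries that enclose `offset`."""
    start, end = 0, len(text)
    for boundary in boundaries:
        if boundary <= offset:
            start = boundary  # latest boundary at or before the offset
        else:
            end = boundary  # first boundary after the offset
            break
    return text[start:end], start, end


sentence = "This is another, silly test."
# Assumed word definition: runs of non-whitespace, punctuation included.
word_starts = [m.start() for m in re.finditer(r"\S+", sentence)]
word_ends = [m.end() for m in re.finditer(r"\S+", sentence)]

start_8 = segment_at(sentence, 8, word_starts)   # WORD_START at offset 8
end_8 = segment_at(sentence, 8, word_ends)       # WORD_END at offset 8
start_23 = segment_at(sentence, 23, word_starts)  # WORD_START at offset 23
end_23 = segment_at(sentence, 23, word_ends)      # WORD_END at offset 23
```

With WORD_START the returned span runs from one word start to the next, so it
carries the trailing comma and space; with WORD_END it runs from the previous
word end to the current one, so it carries the leading space and the comma.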
