[webkit-dev] Feedback on Blink's text fragment directive proposal

Thu Sep 24 12:28:21 PDT 2020

On Wed, Sep 23, 2020 at 3:20 AM Ryosuke Niwa <rniwa at webkit.org> wrote:

>
> On Fri, Sep 18, 2020 at 7:35 AM David Bokan <bokan at chromium.org> wrote:
>
>> Friendly ping to get an answer here.
>>
>> Do my answers above address those points or is there anything else I can
>> clarify?
>>
>> Thanks,
>> David
>>
>> On Mon, Aug 31, 2020 at 1:42 PM David Bokan <bokan at chromium.org> wrote:
>>
>>> [sending (again, sorry) from correct e-mail]
>>>
>>> I think Nick's replies mostly still apply; here are some updated
>>> answers to those questions.
>>>
>>> (1) We’re concerned about compatibility issues in a world where some
>>>> browsers support this but not all. Aware browsers will strip `:~:`, but
>>>> unaware browsers won’t. I saw that on the blink-dev ItS thread, it was
>>>> mentioned that at least one site (webmd.com) totally breaks if any
>>>> fragment ID is exposed to the page. This makes it difficult to create a
>>>> link that uses this feature but which is safe in all browsers:
>>>> - Since there is no feature detection mechanism, it’s hard for a
>>>> webpage to know whether it should issue such a link. It would have to be
>>>> based on UA string checks, which is regrettable.
>>>> - A link meant for a supporting browser can end up in a non-supporting
>>>> browser, at the very least by copy paste from the URL field, and perhaps
>>>> through other features to share a link.
>>>>
>>>
>>> We do have a feature detection mechanism for this.
>>>
>>> On the latter point, this is true but we think implementing fragment
>>> directive stripping (removing the part after and including `:~:`) is
>>> trivial even if the UA doesn't wish to implement the text-fragment feature.
>>> FWIW, we haven't seen or heard of another such example since.
>>>
>>
> We continue to be concerned about this backwards compatibility issue.
>

Is there any kind of data we could gather that might allay concerns? Or
mitigations we could consider? Applications that generate these links
dynamically can feature detect for UA support. Pages should already be
considering unexpected hashes; the WebMD case so far seems to have been an
outlier.
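
To illustrate how small the stripping mitigation is, here is a minimal
sketch in Python. It is an illustration only, not code from the spec or
from any browser: the function name strip_fragment_directive is
hypothetical, and dropping a now-empty "#" is a choice made for this
sketch rather than something the proposal is claimed to require.

```python
FRAGMENT_DIRECTIVE_DELIMITER = ":~:"


def strip_fragment_directive(url: str) -> str:
    """Remove ':~:' and everything after it from the URL's fragment.

    Hypothetical helper for illustration; only the fragment (the part
    after the first '#') is inspected, so ':~:' elsewhere is untouched.
    """
    base, hash_sep, fragment = url.partition("#")
    if not hash_sep:
        # No fragment at all, so there is nothing to strip.
        return url
    kept, delimiter, _directive = fragment.partition(FRAGMENT_DIRECTIVE_DELIMITER)
    if not delimiter:
        # The fragment carries no directive; leave the URL as-is.
        return url
    # Assumption of this sketch: drop the '#' when nothing is left after it.
    return f"{base}#{kept}" if kept else base
```

With this sketch, "https://example.com/page#section:~:text=foo" becomes
"https://example.com/page#section", and a URL without a directive is
returned unchanged.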

>
> (3) Text fragment trumping a regular fragment ID seems a bit strange. The
>>>> more natural semantic would be that the text search starts at the fragment,
>>>> so if there are multiple matches it’s possible to scroll to a more specific
>>>> one. It’s not clear why the fragment is instead entirely ignored.
>>>>
>>>
>>> This was discussed in more detail in issue#75; I agree with Nick's
>>> point that the disambiguation syntax is already specific enough that
>>> starting from a fragment isn't necessary. This also keeps us
>>> mostly-compatible with the TextQuoteSelector specified in
>>> WebAnnotations which I think may have benefits for interaction with
>>> annotation applications.
>>>
>>
> This will limit the utility of this feature. For something with as broad
> an impact as a URL format change, it seems rather short-sighted.
>

Could you elaborate on why you think this limits its utility? From my point
of view keeping them independent is conceptually simpler and more robust
since we don't have to depend on two aspects of the page being unchanged.
Given that the syntax allows precise targeting of ambiguous text snippets I
don't really see a clear downside to this but maybe I'm missing your point?

>
> Also, Web Annotations Data Model allows other kinds of annotations:
> https://www.w3.org/TR/2017/REC-annotation-model-20170223/#selectors
>
> Is there any reason this particular matching algorithm was the only one
> picked, with no possibility of future extensibility?
>

You mean why of all the selectors specified there only TextQuoteSelector
was chosen? We started with text as we think it's the most useful of the
set but this doesn't preclude eventually adding others. One natural
extension that we've heard demand for is scrolling to images.

Our original exploration
<https://github.com/WICG/scroll-to-text-fragment/#css-selector-fragments>
looked at using arbitrary CSS selectors but this got rather complicated as
being able to target arbitrary parts of the DOM seemed potentially scary
from a security perspective
<https://github.com/WICG/scroll-to-text-fragment/#security-considerations>
(e.g. a security flaw might expose CSRF tokens rather than just text).

As to the fragment syntax provided in WebAnnotations, there are two reasons
we chose a different syntax:

  * We needed some way to hide the fragment from the page so that it works
    on pages with fragment routing.
  * The WebAnnotations fragment syntax is quite verbose. We believe there's
    benefit to keeping these links shorter and easier to hand-craft.

However, the model is effectively the same (the exception being that
WebAnnotations doesn't support start/end ranges); a WebAnnotation
TextQuoteSelector can be mechanically converted to a text-fragment.
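
As a rough sketch of that conversion (Python, illustration only): the
function names below are hypothetical, the selector is assumed to be a
plain dict with the "exact", "prefix" and "suffix" keys from the Web
Annotation Data Model, and any differences in how the two models match
prefix/suffix context are ignored. Terms are percent-encoded, including
"-" and ",", since those characters are part of the text directive syntax.

```python
from urllib.parse import quote


def _encode_term(term: str) -> str:
    # quote() leaves '-' unescaped, so escape it by hand; ',' and '&'
    # are already percent-encoded because safe is empty.
    return quote(term, safe="").replace("-", "%2D")


def text_quote_selector_to_fragment(selector: dict) -> str:
    """Build ':~:text=[prefix-,]exact[,-suffix]' from a TextQuoteSelector.

    Hypothetical helper; 'selector' is assumed to be a dict shaped like
    the Web Annotation TextQuoteSelector ('exact', optional 'prefix'
    and 'suffix').
    """
    parts = []
    if selector.get("prefix"):
        parts.append(_encode_term(selector["prefix"]) + "-")
    parts.append(_encode_term(selector["exact"]))
    if selector.get("suffix"):
        parts.append("-" + _encode_term(selector["suffix"]))
    return ":~:text=" + ",".join(parts)
```

For example, a selector with exact "quick brown", prefix "the" and suffix
"fox" yields ":~:text=the-,quick%20brown,-fox".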

>
> - R. Niwa
>
>