[Webkit-unassigned] [Bug 210502] New: [GTK] TextNode::splitText() can lose content
bugzilla-daemon at webkit.org
Tue Apr 14 09:42:09 PDT 2020
https://bugs.webkit.org/show_bug.cgi?id=210502
Bug ID: 210502
Summary: [GTK] TextNode::splitText() can lose content
Product: WebKit
Version: Other
Hardware: Unspecified
OS: Unspecified
Status: NEW
Severity: Normal
Priority: P2
Component: WebKitGTK
Assignee: webkit-unassigned at lists.webkit.org
Reporter: mcrha at redhat.com
CC: bugs-noreply at webkitgtk.org
Created attachment 396429
--> https://bugs.webkit.org/attachment.cgi?id=396429&action=review
How it looks like in Firefox
I just noticed that calling splitText() in the middle of a character that occupies more than one UTF-16 code unit (such as an Emoji) causes content to be lost on both sides of the split. This is with trunk at r259630.
Steps:
a) run: MiniBrowser --editor-mode
b) open the Inspector and in its console run: document.body.innerText = "" (with three Emojis between the quotes)
c) still in the inspector run: document.body.firstChild.splitText(2)
* all is fine, the Elements tab shows the text properly split into one and two Emojis
d) still in the inspector run: document.body.firstChild.nextSibling.splitText(1)
After d) there are three text nodes in the body: the first shows the first Emoji, the second appears to be empty, and the third seems to contain two characters that look like whitespace. Their actual content is:
document.body.firstChild.nextSibling.nodeValue.length
1
document.body.firstChild.nextSibling.nodeValue.charCodeAt(0)
55357
document.body.firstChild.nextSibling.nextSibling.nodeValue.length
3
document.body.firstChild.nextSibling.nextSibling.nodeValue.charCodeAt(0)
56841
document.body.firstChild.nextSibling.nextSibling.nodeValue.charCodeAt(1)
55357
document.body.firstChild.nextSibling.nextSibling.nodeValue.charCodeAt(2)
56898
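The values above can be decoded to see what each node holds. This is a minimal sketch that uses only the numbers reported by charCodeAt(); the variable and helper names are made up for illustration. It suggests that the second node holds a lone high surrogate and that the third node starts with the matching low surrogate, so the two halves together still form valid characters.

```javascript
// Code units reported by charCodeAt() in the console output above.
const secondNode = [55357];               // 0xD83D
const thirdNode = [56841, 55357, 56898];  // 0xDE09, 0xD83D, 0xDE42

// UTF-16 surrogate ranges (illustrative helpers, not part of any DOM API).
const isHighSurrogate = (u) => u >= 0xD800 && u <= 0xDBFF;
const isLowSurrogate = (u) => u >= 0xDC00 && u <= 0xDFFF;

console.log(isHighSurrogate(secondNode[0])); // true: first half of a pair
console.log(isLowSurrogate(thirdNode[0]));   // true: second half of that pair

// Rejoining the code units of both nodes gives two well-formed pairs again.
const rejoined = String.fromCharCode(...secondNode, ...thirdNode);
const codePoints = Array.from(rejoined, (ch) => ch.codePointAt(0).toString(16));
console.log(codePoints); // [ '1f609', '1f642' ]
```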
I do not know what to expect here, but being able to break "a letter" in the middle, and having it disappear together with the next letter, is not ideal.
Notes:
- Calling document.body.normalize() restores the state from after step b).
- It seems the splitText() result itself is correct (see above), but the visual interpretation is broken (at least the second Emoji of the split node, which is left intact, could be visible rather than looking like whitespace).
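Until the rendering is sorted out, a caller could avoid the problem by never splitting between the two halves of a surrogate pair. The helper below is only a sketch of that idea; safeSplitOffset is a hypothetical name and not part of WebKit or the DOM API.

```javascript
// Hypothetical helper: if `offset` falls between a high and a low surrogate,
// move it back by one code unit so the pair stays in one piece.
function safeSplitOffset(text, offset) {
  if (offset <= 0 || offset >= text.length) return offset;
  const before = text.charCodeAt(offset - 1);
  const after = text.charCodeAt(offset);
  const splitsPair =
    before >= 0xD800 && before <= 0xDBFF &&
    after >= 0xDC00 && after <= 0xDFFF;
  return splitsPair ? offset - 1 : offset;
}

// Same content as the second node before step d): U+1F609 followed by U+1F642.
const twoEmojis = "\u{1F609}\u{1F642}";
console.log(safeSplitOffset(twoEmojis, 1)); // 0 (offset 1 is inside the first pair)
console.log(safeSplitOffset(twoEmojis, 2)); // 2 (offset 2 is between the Emojis)

// In a page this might be used as:
//   node.splitText(safeSplitOffset(node.nodeValue, 1));
```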
I tried Firefox (67.0) and it behaves similarly (also two characters per Emoji), but there the splitText() call has no impact on the visual rendering in the document body. It does affect the rendering in the Inspector, which shows characters it cannot display as rectangles containing the hex code.
-------------------------------------------
Side notes:
Are there other scripts that use such multi-code-unit characters, for example some Chinese variants?
That an Emoji occupies two characters is also impractical for line length calculations, even though it is drawn as a single character. I know of "composite" Emojis, which are an even bigger nightmare on many fronts.
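To illustrate the length problem, the sketch below measures three sample strings in UTF-16 code units, in code points, and (where Intl.Segmenter is available) in grapheme clusters. The samples are assumptions chosen for illustration: one Emoji, one CJK Extension B ideograph (U+20BB7, which also lies outside the Basic Multilingual Plane), and one "composite" Emoji built from three Emojis joined by U+200D.

```javascript
// Illustrative samples; none of them come from the original report.
const samples = {
  emoji: "\u{1F609}",
  cjkExtB: "\u{20BB7}",
  family: "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}",
};

function measure(text) {
  const codeUnits = text.length;               // what nodeValue.length reports
  const codePoints = Array.from(text).length;  // string iteration goes by code point
  // Grapheme clusters approximate what a user sees as one character.
  const graphemes = typeof Intl.Segmenter === "function"
    ? Array.from(new Intl.Segmenter(undefined, { granularity: "grapheme" }).segment(text)).length
    : undefined;
  return { codeUnits, codePoints, graphemes };
}

for (const [name, text] of Object.entries(samples)) {
  console.log(name, measure(text));
}
// emoji:   codeUnits 2, codePoints 1
// cjkExtB: codeUnits 2, codePoints 1
// family:  codeUnits 8, codePoints 5
```

For line length calculations the grapheme count is probably the closest of the three to the number of drawn characters, although actual widths still depend on the font.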
--
You are receiving this mail because:
You are the assignee for the bug.