webkit editing rewrite?
At the editing meeting at the WebKit conference in April, we discussed the idea of coming up with a replacement for Position/Range, using that throughout the editing code and then eventually exposing that API to the web to supersede DOM Ranges. Specifically, the idea was to get rid of node/offset pairs and instead have the following possible positions: beforeNode, nodeStart, afterNode, nodeEnd. We would then only use offsets for positions inside text nodes.
There are also a slew of other APIs web developers need to make editing work well, e.g., undo management, better selection control, position normalizing, etc.
Finally, the editing code is crashtastic. Some of that is due to less than great foundations like the current Position class. Some of it is due to editing just being complicated.
Some of us had a somewhat crazy idea to rewrite much of the editing code (e.g. document.execCommand) in JavaScript.
Pros:
-Ensures that the APIs we expose to the web are at least good enough for our own editing code
-Ensures that editing code never crashes (outside of JSC/V8 bugs)
-Gives a clean slate for starting the editing code anew
-Moves code out of WebCore
-If other browser vendors choose to expose the same APIs, then we can share the editing library and make the world better for web developers
Cons:
-Potentially slower since DOM calls are now JS-->C++
-Potential for regressions due to holes in the layout test coverage
-Not statically typed
I'm not too concerned about the perf hit. It should be no more than a constant factor and, historically, the editing perf problems have been order-of-magnitude issues.
As for the functionality regressions, I think they're inevitable. We'd hit most of the same issues trying to refactor the existing C++ code on top of better APIs. The best we can hope for is to write extra tests for each bit that we port over.
We don't need to do this as one large atomic swap. We can port one command at a time to keep the changes incremental.
There's also the question of whether we should expose the library code to JS. But that's orthogonal really. I think we should start by not exposing the code. We can always decide to expose it later if we think it's worthwhile.
What do you think?
Ojan
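For readers following the thread, the position model described in this message could be sketched roughly as follows. This is only an illustration in TypeScript, not anything proposed on the list: `SketchText`, `SketchElement`, `EditingPosition`, and `fromNodeOffset` are hypothetical names, and the mapping from legacy node/offset pairs is one assumed interpretation of "only use offsets for positions inside text nodes".

```typescript
// Illustrative only: these types are hypothetical stand-ins, not WebKit or DOM APIs.
interface SketchText {
  kind: "text";
  data: string;
}
interface SketchElement {
  kind: "element";
  tagName: string;
  children: SketchNode[];
}
type SketchNode = SketchText | SketchElement;

// The four node-relative positions named in the proposal, plus an offset
// form that is only allowed inside text nodes.
type EditingPosition =
  | { type: "beforeNode"; node: SketchNode }
  | { type: "nodeStart"; node: SketchElement }
  | { type: "nodeEnd"; node: SketchElement }
  | { type: "afterNode"; node: SketchNode }
  | { type: "inText"; node: SketchText; offset: number };

// Assumed mapping from a legacy (container, offset) pair to the new model.
function fromNodeOffset(container: SketchNode, offset: number): EditingPosition {
  if (container.kind === "text") {
    if (offset < 0 || offset > container.data.length) {
      throw new RangeError("offset is outside the text node");
    }
    return { type: "inText", node: container, offset };
  }
  const children = container.children;
  if (offset < 0 || offset > children.length) {
    throw new RangeError("offset is outside the element's child list");
  }
  if (offset === 0) {
    return { type: "nodeStart", node: container };
  }
  if (offset === children.length) {
    return { type: "nodeEnd", node: container };
  }
  const child = children[offset];
  if (child === undefined) {
    throw new RangeError("offset must be an integer child index");
  }
  return { type: "beforeNode", node: child };
}
```

In this sketch an element offset never survives the conversion: it becomes `nodeStart`, `nodeEnd`, or `beforeNode` of a specific child, so a position cannot silently point at a stale child index. `afterNode` is part of the type but is not produced by this particular mapping.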
On Aug 3, 2010, at 4:32 PM, Ojan Vafai wrote:
Pros:
-Ensures that the APIs we expose to the web are at least good enough for our own editing code
-Ensures that editing code never crashes (outside of JSC/V8 bugs)
-Gives a clean slate for starting the editing code anew
-Moves code out of WebCore
-If other browser vendors choose to expose the same APIs, then we can share the editing library and make the world better for web developers
Cons:
-Potentially slower since DOM calls are now JS-->C++
-Potential for regressions due to holes in the layout test coverage
-Not statically typed
I am more interested in what these new APIs would be that we’d rebuild editing on top of. Using JavaScript as the programming language doesn’t seem so great, but I’m not as passionate about this as I am about devising good abstractions to build editing on top of.
Coming up with the abstraction that lets us build efficient editing code seems like a challenge. Programming language has little to do with it. We’ve had lots of performance problems with editing code.
Inventing a new layer to rebuild editing on top could well be good. Exposing that layer itself to webpages seems like it makes the job even harder rather than easier! Hidden implementation details can be changed more easily than exposed APIs.
I personally don’t think a complete rewrite is a great idea, nor do I think that using JavaScript is how I’d do it.
-- Darin
On Aug 3, 2010, at 4:38 PM, Darin Adler wrote:
Inventing a new layer to rebuild editing on top could well be good. Exposing that layer itself to webpages seems like it makes the job even harder rather than easier! Hidden implementation details can be changed more easily than exposed APIs.
I personally don’t think a complete rewrite is a great idea, nor do I think that using JavaScript is how I’d do it.
I strongly agree with these points. In addition I'd add, there seem to be multiple large changes proposed under the "complete rewrite" heading:
(1) A new editing API exposed to Web content.
(2) A new set of fundamental abstractions to build editing on top of (which maybe has to be the same as #1?)
(3) A change in implementation language for much of the editing code from C++ to JavaScript.
(4) A from-scratch rewrite of the whole editing subsystem, rather than an incremental refactoring.
Each of these seems like a very high-risk project. Doing all four at once seems to put the risk into the red zone. My idea of how to make these kinds of changes would be to do one at a time, and probably not do #3 or #4 at all. Rewriting is almost never a better option than refactoring.
Regards,
Maciej
On Aug 3, 2010, at 4:32 PM, Ojan Vafai wrote:
At the editing meeting at the WebKit conference in April, we discussed the idea of coming up with a replacement to Position/Range, using that throughout the editing code and then eventually exposing that API to the web to supersede DOM Ranges.
I assume you plan on maintaining support for the DOM Range API? <http://www.w3.org/TR/DOM-Level-2-Traversal-Range/ranges.html>
Simon
Some of us had a somewhat crazy idea to rewrite much of the editing code (e.g. document.execCommand) in JavaScript.
Pros:
-Ensures that the APIs we expose to the web are at least good enough for our own editing code
I don't think this necessarily follows. Not everything exposed to the internal editing implementation would necessarily be exposed to the web. If we required that everything exposed to the internal editing implementation be exposed to the web, that would substantially slow development, since every new API would need to be vetted and possibly standardized. So this is either not true or a substantial con.
-Ensures that editing code never crashes (outside of JSC/V8 bugs)
JavaScript can still crash -- you just get an unhandled exception instead of a segfault. It's not clear to me why that would be better. I can think of reasons why it would be worse:
- Can't use standard OS tools like CrashReporter to detect problem areas.
- Harder to debug, since you need to use the Web Inspector, which:
  - doesn't have all the features of modern C++ debuggers, like watchpoints and breakpoint commands
  - creates a circular dependency
- Sometimes, instead of an unhandled exception, you'll just get incorrect behavior that's very hard to track down.
A similar set of cons pertains to performance issues.
-Gives a clean slate for starting the editing code anew
This is an argument for a rewrite, not an argument for JavaScript. A rewrite can happen in any language. A rewrite is not self-evidently a good thing.
-Moves code out of WebCore
Changing the language doesn't move the code out of WebCore. Moving code out of WebCore is not self-evidently a good thing.
-If other browser vendors choose to expose the same APIs, then we can share the editing library and make the world better for web developers
That's a big if. Do you have any evidence that other vendors are interested? Are there vendors specifically interested in adopting WebKit's editing library, but not WebKit as a whole? That would surprise me.
Cons:
-Potentially slower since DOM calls are now JS-->C++
-Potential for regressions due to holes in the layout test coverage
-Not statically typed
I notice that you don't mention the added complexity of gluing two languages together for core DOM operations. I think that's probably the main con.
I'm not too concerned about the perf hit. It should be no more than a constant-factor and, historically, the editing perf problems have been order-of-magnitude issues.
You're not considering the hurt that the editing JavaScript code could put on website code. If the editing memory footprint is large, the GC hit on other websites could be substantial.
JavaScript's scoping rules also have a nasty tendency to introduce accidental memory references that keep large object graphs alive, exacerbating this problem.
For security reasons, we might need to instantiate a new copy of the editing code for every webpage. That could be a substantial memory use regression.
As for the functionality regressions, I think they're inevitable. We'd hit most of the same issues trying to refactor the existing C++ code on top of better APIs.
I agree that a rewrite inevitably introduces a large number of bugs, regardless of whether it happens in C++ or JavaScript.
However, I don't agree that refactoring inevitably introduces just as many bugs as rewriting. I would submit that the entire history of the WebKit project demonstrates the value of refactoring over rewriting.
Geoff
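The point above about scoping rules keeping object graphs alive can be illustrated with a small sketch. It is an editorial example, not code from the thread: `Snapshot`, `makeUndoLabelLeaky`, and `makeUndoLabel` are hypothetical names, and the array of strings stands in for a large DOM-derived object graph.

```typescript
// Hypothetical stand-in for a large object graph held by editing code.
interface Snapshot {
  nodes: string[];
}

// The returned closure refers to `snapshot`, so the whole snapshot stays
// reachable for as long as the callback itself is reachable.
function makeUndoLabelLeaky(snapshot: Snapshot): () => string {
  return () => `undo ${snapshot.nodes.length} nodes`;
}

// Here the one needed number is copied out first. The closure refers only
// to `count`, so it does not itself keep the snapshot reachable.
function makeUndoLabel(snapshot: Snapshot): () => string {
  const count = snapshot.nodes.length;
  return () => `undo ${count} nodes`;
}
```

The retained reference is observable without inspecting the heap: if the snapshot is emptied after both callbacks are created, the first callback reports the emptied state because it still holds the live object, while the second keeps reporting the count it copied. How much memory this costs in practice depends on the engine and on what else shares the enclosing scope.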
On Tue, Aug 3, 2010 at 5:07 PM, Geoffrey Garen <ggaren@apple.com> wrote:
Some of us had a somewhat crazy idea to rewrite much of the editing code (e.g. document.execCommand) in JavaScript.
Pros:
-Ensures that the APIs we expose to the web are at least good enough for our own editing code
I don't think this necessarily follows. Not everything exposed to the internal editing implementation would necessarily be exposed to the web. If we required that everything exposed to the internal editing implementation be exposed to the web, that would substantially slow development, since every new API would need to be vetted and possibly standardized. So this is either not true or a substantial con.
It's not like that's a serial process. More to the point, assuming the current command API is the result (as proposed) and that DOM ranges are supported, it's unclear that standardization is a major risk. In any case, we won't get better APIs if we don't try and the current APIs suck hard.
-Ensures that editing code never crashes (outside of JSC/V8 bugs)
JavaScript can still crash -- you just get an unhandled exception instead of a segfault. It's not clear to me why that would be better. I can think of reasons why it would be worse:
These crashes are much less likely to be exploitable security issues. That's one (major) plus.
- Can't use standard OS tools like CrashReporter to detect problem areas.
- Harder to debug, since you need to use the Web Inspector, which:
  - doesn't have all the features of modern C++ debuggers, like watchpoints and breakpoint commands
  - creates a circular dependency
- Sometimes, instead of an unhandled exception, you'll just get incorrect behavior that's very hard to track down.
A similar set of cons pertains to performance issues.
I'm not sure that's clearly true.
-Gives a clean slate for starting the editing code anew
This is an argument for a rewrite, not an argument for JavaScript. A rewrite can happen in any language.
A rewrite is not self-evidently a good thing.
Nobody suggested it was? The basis for the suggestion is:
* today's JS APIs are not fit for the tasks that are being asked of them
* the command system isn't extensible from JS, causing incredible amounts of hackery and rework in every JS wrapper for editors
* when things go south in the current system, crashes are potentially exploitable
-Moves code out of WebCore
Changing the language doesn't move the code out of WebCore.
Moving code out of WebCore is not self-evidently a good thing.
-If other browser vendors choose to expose the same APIs, then we can share the editing library and make the world better for web developers
That's a big if. Do you have any evidence that other vendors are interested? Are there vendors specifically interested in adopting WebKit's editing library, but not WebKit as a whole? That would surprise me.
Fair enough, but consider the case of a rich text editing system as exists in products like Gmail and MobileMe where the current infrastructure is as much a liability as an asset. For those systems, having better plumbing and being able to operate on more deterministic, low-level APIs for editing would be a serious plus.
Cons:
-Potentially slower since DOM calls are now JS-->C++
-Potential for regressions due to holes in the layout test coverage
-Not statically typed
I notice that you don't mention the added complexity of gluing two languages together for core DOM operations. I think that's probably the main con.
The bindings are already opaque. How is this really worse?
I'm not too concerned about the perf hit. It should be no more than a constant-factor and, historically, the editing perf problems have been order-of-magnitude issues.
You're not considering the hurt that the editing JavaScript code could put on website code. If the editing memory footprint is large, the GC hit on other websites could be substantial.
JavaScript's scoping rules also have a nasty tendency to introduce accidental memory references that keep large object graphs alive, exacerbating this problem.
Luckily Ojan et al. happen to be very good at JavaScript ;-)
Regards
For security reasons, we might need to instantiate a new copy of the editing code for every webpage. That could be a substantial memory use regression.
As for the functionality regressions, I think they're inevitable. We'd hit most of the same issues trying to refactor the existing C++ code on top of better APIs.
I agree that a rewrite inevitably introduces a large number of bugs, regardless of whether it happens in C++ or JavaScript.
However, I don't agree that refactoring inevitably introduces just as many bugs as rewriting. I would submit that the entire history of the WebKit project demonstrates the value of refactoring over rewriting.
Geoff _______________________________________________ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
we won't get better APIs if we don't try and the current APIs suck hard.
Better APIs don't require a rewrite in a new language. As I just said, a rewrite in a new language seems to be a distraction from better APIs.
-Ensures that editing code never crashes (outside of JSC/V8 bugs)
JavaScript can still crash -- you just get an unhandled exception instead of a segfault. It's not clear to me why that would be better. I can think of reasons why it would be worse:
These crashes are much less likely to be exploitable security issues. That's one (major) plus.
Have security issues in editing code been a big attack vector? Bigger than other vectors that we're not considering rewriting in JavaScript?
- Can't use standard OS tools like CrashReporter to detect problem areas.
- Harder to debug, since you need to use the Web Inspector, which:
  - doesn't have all the features of modern C++ debuggers, like watchpoints and breakpoint commands
  - creates a circular dependency
- Sometimes, instead of an unhandled exception, you'll just get incorrect behavior that's very hard to track down.
A similar set of cons pertains to performance issues.
I'm not sure that's clearly true.
Could you elaborate? How would SpinTracer integrate with JavaScript editing? How about Shark? Are you unconcerned about all the debugging tools we would lose?
For those systems, having better plumbing and being able to operate on more deterministic, low-level APIs for editing would be a serious plus.
Once again, this is a pro to better editing abstractions, not a pro to a rewrite, nor a pro to JavaScript.
I notice that you don't mention the added complexity of gluing two languages together for core DOM operations. I think that's probably the main con.
The bindings are already opaque. How is this really worse?
More glue is more con. Also, we've never used glue to implement core DOM operations before. We've only used JS to wrap the DOM before.
I'm not too concerned about the perf hit. It should be no more than a constant-factor and, historically, the editing perf problems have been order-of-magnitude issues.
You're not considering the hurt that the editing JavaScript code could put on website code. If the editing memory footprint is large, the GC hit on other websites could be substantial.
JavaScript's scoping rules also have a nasty tendency to introduce accidental memory references that keep large object graphs alive, exacerbating this problem.
Luckily Ojan et al. happen to be very good at JavaScript ;-)
Are they correspondingly bad at C++? If so, that might be an argument for using JavaScript instead of C++, but a better solution would probably be to find engineers who are good at C++.
Geoff
participants (6)
- Alex Russell
- Darin Adler
- Geoffrey Garen
- Maciej Stachowiak
- Ojan Vafai
- Simon Fraser