Re: [webkit-changes] [24723] trunk/WebCore
I'm not happy with the code organization here. XMLTokenizer now has tons of ifdefs and two separate implementations. It's fine to have a QXmlStream implementation, but the two implementations should be in separate files, side by side, as we do in the platform layer. Or the XML parser should be abstracted out so we have a single XMLTokenizer built on a single parser abstraction. I think this is also the type of patch that should be reviewed by at least one person other than the folks concentrating on the Qt port. With the huge number of check-ins today, I'm sure there are other interesting things like this that I missed. -- Darin
On Jul 27, 2007, at 7:50 AM, Darin Adler wrote:
With the huge number of check-ins today, I'm sure there are other interesting things like this that I missed.
I looked over the other check-ins, and I think I was wrong about this -- my only real concerns are the UChar/JChar one and the XML ifdefs. -- Darin
On Friday 27 July 2007 16:50:01 Darin Adler wrote:
I'm not happy with the code organization here. XMLTokenizer now has tons of ifdefs and two separate implementations. It's fine to have a QXmlStream implementation, but the two implementations should be in separate files, side by side, as we do in the platform layer.
I agree that the #ifdef's are not optimal, and we can move parts of the code out. But currently a lot of code is still shared between the QXmlStream and the libxml based implementations, so we have two choices: 1. just add an XMLTokenizerQt.cpp and duplicate lots of code. 2. add a qt/XMLTokenizerQt.cpp and a libxml/XMLTokenizerLibXml.cpp and keep a common XMLTokenizer.cpp for code that is used in both. 3. Have some ifdef's in the one file. The first option is no good, as we'd have to duplicate lot's of shared code leading to maintenance issues in the long term. The second option is better, but also here we'd have quite a bit more duplicated code than required as long as we don't do some bigger refactoring to abstract out the common parts. This is something we wanted to avoid at the moment. So we went for option 3.
Or the XML parser should be abstracted out so we have a single XMLTokenizer built on a single parser abstraction.
That will be IMO almost impossible to do in an efficient way as the two parsers work very differently.
I think this is also the type of patch that should be reviewed by at least one person other than the folks concentrating on the Qt port.
We've had other people look at most patches. The QXmlStream changes are required, and I think the only issue to discuss here could be separating implementations. This is something I'd like to do in the longer term, but as you're trying to stabilize currently I didn't feel that this should have been done right now.
With the huge number of check-ins today, I'm sure there are other interesting things like this that I missed.
I think we did a thorough job reviewing our patches and testing them on all platforms (Mac, Windows, gtk and Qt). We also kept most changes rather small and atomic to make reviewing easier. Lars
On Jul 27, 2007, at 11:36 AM, Lars Knoll wrote:
2. add a qt/XMLTokenizerQt.cpp and a libxml/XMLTokenizerLibXml.cpp and keep a common XMLTokenizer.cpp for code that is used in both.
I like that option best. It's the pattern used in platform for cases like this. -- Darin
On Friday 27 July 2007 20:50:53 Darin Adler wrote:
On Jul 27, 2007, at 11:36 AM, Lars Knoll wrote:
2. add a qt/XMLTokenizerQt.cpp and a libxml/XMLTokenizerLibXml.cpp and keep a common XMLTokenizer.cpp for code that is used in both.
I like that option best. It's the pattern used in platform for cases like this.
I'm fine with moving to this approach (even though it'll still lead to some code duplication if we do it the easy way without refactoring). Should we move the XMLTokenizer class to WebCore/platform then? Lars
On Jul 27, 2007, at 11:53 AM, Lars Knoll wrote:
I'm fine with moving to this approach (even though it'll still lead to some code duplication if we do it the easy way without refactoring).
I don't think we should insist on doing it without refactoring. It seems good to add private member functions as necessary so we can share as much of the code as possible. The reason I'm particularly sensitive on this issue is that fixing the structure of ifdef'd code like this is something we've spent a lot of time on the last two years. We still have quite a bit left to fix from decisions I regret when adapting the code to Mac OS X, and I'd like to avoid introducing new cases of it now.
Should we move the XMLTokenizer class to WebCore/platform then?
No. If we were making an independent XML abstraction that didn't depend on the rest of WebKit then it would belong there. But since we've decided to not go that way, this is just platform-specific code in another subdirectory, which we do as needed. See the loader directory, for example. I'm not sure I like the filename XMLTokenizerLibXml.cpp, but I can't think of anything better. -- Darin
On Friday 27 July 2007 21:05:00 Darin Adler wrote:
On Jul 27, 2007, at 11:53 AM, Lars Knoll wrote:
I'm fine with moving to this approach (even though it'll still lead to some code duplication if we do it the easy way without refactoring).
I don't think we should insist on doing it without refactoring. It seems good to add private member functions as necessary so we can share as much of the code as possible.
Good, I was just under the impression that you wanted to avoid these kind of changes at the moment. I'll prepare a patch then.
The reason I'm particularly sensitive on this issue is that fixing the structure of ifdef'd code like this is something we've spent a lot of time on the last two years. We still have quite a bit left to fix from decisions I regret when adapting the code to Mac OS X, and I'd like to avoid introducing new cases of it now.
I agree. We mostly try to avoid these things for our platform specific code in Qt as well.
Should we move the XMLTokenizer class to WebCore/platform then?
No.
If we were making an independent XML abstraction that didn't depend on the rest of WebKit then it would belong there. But since we've decided to not go that way, this is just platform-specific code in another subdirectory, which we do as needed. See the loader directory, for example.
Ok.
I'm not sure I like the filename XMLTokenizerLibXml.cpp, but I can't think of anything better.
I'd be happer to find a better name, but I couldn't think of one. Cheers, Lars
On Friday 27 July 2007 21:15:27 Lars Knoll wrote:
On Friday 27 July 2007 21:05:00 Darin Adler wrote:
On Jul 27, 2007, at 11:53 AM, Lars Knoll wrote:
I'm fine with moving to this approach (even though it'll still lead to some code duplication if we do it the easy way without refactoring).
I don't think we should insist on doing it without refactoring. It seems good to add private member functions as necessary so we can share as much of the code as possible.
Good, I was just under the impression that you wanted to avoid these kind of changes at the moment. I'll prepare a patch then.
Patch is in http://bugs.webkit.org/show_bug.cgi?id=14791 . Please review. Cheers, Lars
On Jul 27, 2007, at 11:36 AM, Lars Knoll wrote:
On Friday 27 July 2007 16:50:01 Darin Adler wrote:
I'm not happy with the code organization here. XMLTokenizer now has tons of ifdefs and two separate implementations. It's fine to have a QXmlStream implementation, but the two implementations should be in separate files, side by side, as we do in the platform layer.
I agree that the #ifdef's are not optimal, and we can move parts of the code out. But currently a lot of code is still shared between the QXmlStream and the libxml based implementations, so we have two choices:
1. just add an XMLTokenizerQt.cpp and duplicate lots of code. 2. add a qt/XMLTokenizerQt.cpp and a libxml/XMLTokenizerLibXml.cpp and keep a common XMLTokenizer.cpp for code that is used in both. 3. Have some ifdef's in the one file.
The first option is no good, as we'd have to duplicate lot's of shared code leading to maintenance issues in the long term.
The second option is better, but also here we'd have quite a bit more duplicated code than required as long as we don't do some bigger refactoring to abstract out the common parts. This is something we wanted to avoid at the moment. So we went for option 3.
Other organizations have requested the ability to use other XML parsers as well, such as expat. Seems like in the long run we want a different approach than just ifdefs in the XMLTokenizer.cpp file. It seems like the best would be some abstraction layer on top of the parser library, but if that is difficult then your option #2 sounds like a docent long-run approach. I would have expected just about every XML parsing library to have a SAX-like API, which shouldn't be too hard to abstract, but perhaps QXml works differently. Regards, Maciej
Or the XML parser should be abstracted out so we have a single XMLTokenizer built on a single parser abstraction.
That will be IMO almost impossible to do in an efficient way as the two parsers work very differently.
I think this is also the type of patch that should be reviewed by at least one person other than the folks concentrating on the Qt port.
We've had other people look at most patches. The QXmlStream changes are required, and I think the only issue to discuss here could be separating implementations. This is something I'd like to do in the longer term, but as you're trying to stabilize currently I didn't feel that this should have been done right now.
With the huge number of check-ins today, I'm sure there are other interesting things like this that I missed.
I think we did a thorough job reviewing our patches and testing them on all platforms (Mac, Windows, gtk and Qt). We also kept most changes rather small and atomic to make reviewing easier.
Lars _______________________________________________ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo/webkit-dev
Didn't KDOM have an XML parsing abstraction? Would that be worth studying? dave On Jul 27, 2007, at 3:26 PM, Maciej Stachowiak wrote:
On Jul 27, 2007, at 11:36 AM, Lars Knoll wrote:
On Friday 27 July 2007 16:50:01 Darin Adler wrote:
I'm not happy with the code organization here. XMLTokenizer now has tons of ifdefs and two separate implementations. It's fine to have a QXmlStream implementation, but the two implementations should be in separate files, side by side, as we do in the platform layer.
I agree that the #ifdef's are not optimal, and we can move parts of the code out. But currently a lot of code is still shared between the QXmlStream and the libxml based implementations, so we have two choices:
1. just add an XMLTokenizerQt.cpp and duplicate lots of code. 2. add a qt/XMLTokenizerQt.cpp and a libxml/XMLTokenizerLibXml.cpp and keep a common XMLTokenizer.cpp for code that is used in both. 3. Have some ifdef's in the one file.
The first option is no good, as we'd have to duplicate lot's of shared code leading to maintenance issues in the long term.
The second option is better, but also here we'd have quite a bit more duplicated code than required as long as we don't do some bigger refactoring to abstract out the common parts. This is something we wanted to avoid at the moment. So we went for option 3.
Other organizations have requested the ability to use other XML parsers as well, such as expat. Seems like in the long run we want a different approach than just ifdefs in the XMLTokenizer.cpp file. It seems like the best would be some abstraction layer on top of the parser library, but if that is difficult then your option #2 sounds like a docent long-run approach. I would have expected just about every XML parsing library to have a SAX-like API, which shouldn't be too hard to abstract, but perhaps QXml works differently.
Regards, Maciej
Or the XML parser should be abstracted out so we have a single XMLTokenizer built on a single parser abstraction.
That will be IMO almost impossible to do in an efficient way as the two parsers work very differently.
I think this is also the type of patch that should be reviewed by at least one person other than the folks concentrating on the Qt port.
We've had other people look at most patches. The QXmlStream changes are required, and I think the only issue to discuss here could be separating implementations. This is something I'd like to do in the longer term, but as you're trying to stabilize currently I didn't feel that this should have been done right now.
With the huge number of check-ins today, I'm sure there are other interesting things like this that I missed.
I think we did a thorough job reviewing our patches and testing them on all platforms (Mac, Windows, gtk and Qt). We also kept most changes rather small and atomic to make reviewing easier.
Lars _______________________________________________ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo/webkit-dev
_______________________________________________ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo/webkit-dev
Hi, Yes, we had multiple xml parsing backends and were pleased with it at the time. The KDOM code is here: http://websvn.kde.org/trunk/kdenonbeta/kdom/ The relevant code is in backends/ and parser/. You can request a parser using KDOMParserFactory depending on the requirements (qdom couldnt handle all options). Every parser has access to the same DocumentBuilder to build the Document, the builder basically builds using the normal SAX2 calls. The builder keeps track of the node stack too while the doc is built. The backends just call these methods and do not need to keep track of the document. Both sync and async are supported, at least for libxml2. IIRC it was partly inspired by DocumentBuilder being in DOM3 (LS?) spec. The parameter setting parts are KDE specific and the architecture may be too complex (I think the parsers are dynamically loaded) so I don't know if it fits, but hopefully it can be a reference. Cheers, Rob. On 28/07/07, David Hyatt <hyatt@apple.com> wrote:
Didn't KDOM have an XML parsing abstraction? Would that be worth studying?
dave
On Saturday 28 July 2007 00:26:19 Maciej Stachowiak wrote:
On Jul 27, 2007, at 11:36 AM, Lars Knoll wrote:
On Friday 27 July 2007 16:50:01 Darin Adler wrote:
I'm not happy with the code organization here. XMLTokenizer now has tons of ifdefs and two separate implementations. It's fine to have a QXmlStream implementation, but the two implementations should be in separate files, side by side, as we do in the platform layer.
I agree that the #ifdef's are not optimal, and we can move parts of the code out. But currently a lot of code is still shared between the QXmlStream and the libxml based implementations, so we have two choices:
1. just add an XMLTokenizerQt.cpp and duplicate lots of code. 2. add a qt/XMLTokenizerQt.cpp and a libxml/XMLTokenizerLibXml.cpp and keep a common XMLTokenizer.cpp for code that is used in both. 3. Have some ifdef's in the one file.
The first option is no good, as we'd have to duplicate lot's of shared code leading to maintenance issues in the long term.
The second option is better, but also here we'd have quite a bit more duplicated code than required as long as we don't do some bigger refactoring to abstract out the common parts. This is something we wanted to avoid at the moment. So we went for option 3.
Other organizations have requested the ability to use other XML parsers as well, such as expat. Seems like in the long run we want a different approach than just ifdefs in the XMLTokenizer.cpp file. It seems like the best would be some abstraction layer on top of the parser library, but if that is difficult then your option #2 sounds like a docent long-run approach. I would have expected just about every XML parsing library to have a SAX-like API, which shouldn't be too hard to abstract, but perhaps QXml works differently.
I guess that assumption doesn't hold. QXmlStream is a streaming parser with an API that is very different from SAX. It IMO a whole lot simpler to use than a SAX like API and is inspired from similar APIs in the Java world. If you're interested, have a look at http://doc.trolltech.com/4.3/qxmlstreamreader.html Cheers, Lars
On Jul 28, 2007, at 3:52 AM, Lars Knoll wrote:
On Saturday 28 July 2007 00:26:19 Maciej Stachowiak wrote:
On Jul 27, 2007, at 11:36 AM, Lars Knoll wrote:
Other organizations have requested the ability to use other XML parsers as well, such as expat. Seems like in the long run we want a different approach than just ifdefs in the XMLTokenizer.cpp file. It seems like the best would be some abstraction layer on top of the parser library, but if that is difficult then your option #2 sounds like a docent long-run approach. I would have expected just about every XML parsing library to have a SAX-like API, which shouldn't be too hard to abstract, but perhaps QXml works differently.
I guess that assumption doesn't hold. QXmlStream is a streaming parser with an API that is very different from SAX. It IMO a whole lot simpler to use than a SAX like API and is inspired from similar APIs in the Java world. If you're interested, have a look at http://doc.trolltech.com/4.3/qxmlstreamreader.html
I'm told libxml has a StreamReader-style API now as well, so if that's the better alternative, we could design the XML code around that style of API (though probably not right at the moment). Cheers, Maciej
On Sunday 29 July 2007 08:47:50 Maciej Stachowiak wrote:
On Jul 28, 2007, at 3:52 AM, Lars Knoll wrote:
On Saturday 28 July 2007 00:26:19 Maciej Stachowiak wrote:
On Jul 27, 2007, at 11:36 AM, Lars Knoll wrote:
Other organizations have requested the ability to use other XML parsers as well, such as expat. Seems like in the long run we want a different approach than just ifdefs in the XMLTokenizer.cpp file. It seems like the best would be some abstraction layer on top of the parser library, but if that is difficult then your option #2 sounds like a docent long-run approach. I would have expected just about every XML parsing library to have a SAX-like API, which shouldn't be too hard to abstract, but perhaps QXml works differently.
I guess that assumption doesn't hold. QXmlStream is a streaming parser with an API that is very different from SAX. It IMO a whole lot simpler to use than a SAX like API and is inspired from similar APIs in the Java world. If you're interested, have a look at http://doc.trolltech.com/4.3/qxmlstreamreader.html
I'm told libxml has a StreamReader-style API now as well, so if that's the better alternative, we could design the XML code around that style of API (though probably not right at the moment).
No, for the moment, I'd rather just go with the approach I've posted in bug 14791. Once there are requests for more parser backends, we could rethink this, but for now I think we have more urgent things to do. Cheers, Lars
Hello, sorry to bring this old topic up again. Nothing has happened in the last year or so, and this issue is bothering me again... I am thinking that libxml2 should be somehow considered "default" since it is used by most existing ports, and also is quite portable, making it likely that new ports will use it. Maybe this idea will make it possible to make a patch that permits using something else than libxml2 without having to patch build systems other than qmake. Would it be possible to use a variety on the ResourceHandle porting approach and have a WebCore/dom/XMLTokenizerBase.{h, cpp}. The libxml2 implementation lives in WebCore/dom/XMLTokenizer.{h, cpp}. If you don't specify in your build system that you should use WebCore/dom/port/XMLTokenizer.h and WebCore/dom/port/XMLTokenizerPort.cpp, you will get libxml2. An added bonus is that you don't have to come up with a better name than "XMLTokenizerLibXml2.cpp" since it's still called "XMLTokenizer.cpp". /Arvid On Sun, Jul 29, 2007 at 7:53 PM, Lars Knoll <lars@trolltech.com> wrote:
On Sunday 29 July 2007 08:47:50 Maciej Stachowiak wrote:
On Jul 28, 2007, at 3:52 AM, Lars Knoll wrote:
On Saturday 28 July 2007 00:26:19 Maciej Stachowiak wrote:
On Jul 27, 2007, at 11:36 AM, Lars Knoll wrote:
Other organizations have requested the ability to use other XML parsers as well, such as expat. Seems like in the long run we want a different approach than just ifdefs in the XMLTokenizer.cpp file. It seems like the best would be some abstraction layer on top of the parser library, but if that is difficult then your option #2 sounds like a docent long-run approach. I would have expected just about every XML parsing library to have a SAX-like API, which shouldn't be too hard to abstract, but perhaps QXml works differently.
I guess that assumption doesn't hold. QXmlStream is a streaming parser with an API that is very different from SAX. It IMO a whole lot simpler to use than a SAX like API and is inspired from similar APIs in the Java world. If you're interested, have a look at http://doc.trolltech.com/4.3/qxmlstreamreader.html
I'm told libxml has a StreamReader-style API now as well, so if that's the better alternative, we could design the XML code around that style of API (though probably not right at the moment).
No, for the moment, I'd rather just go with the approach I've posted in bug 14791. Once there are requests for more parser backends, we could rethink this, but for now I think we have more urgent things to do.
Cheers, Lars _______________________________________________ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo/webkit-dev
On Sep 10, 2008, at 3:08 PM, Arvid Nilsson wrote:
Hello, sorry to bring this old topic up again. Nothing has happened in the last year or so, and this issue is bothering me again...
There is actually a reviewed patch in <https://bugs.webkit.org/show_bug.cgi?id=20437
now, waiting to be committed. You are welcome to try it and to provide comments on this approach.
- WBR, Alexey Proskuryakov
On Wednesday 10 September 2008 14:09:31 Alexey Proskuryakov wrote:
On Sep 10, 2008, at 3:08 PM, Arvid Nilsson wrote:
Hello, sorry to bring this old topic up again. Nothing has happened in the last year or so, and this issue is bothering me again...
There is actually a reviewed patch in <https://bugs.webkit.org/show_bug.cgi?id=20437
now, waiting to be committed. You are welcome to try it and to
provide comments on this approach.
I try to do a macbuild and run the tests there during the weekend and then land it. z.
On Sep 10, 2008, at 4:08 AM, Arvid Nilsson wrote:
Hello, sorry to bring this old topic up again. Nothing has happened in the last year or so, and this issue is bothering me again...
I am thinking that libxml2 should be somehow considered "default" since it is used by most existing ports, and also is quite portable, making it likely that new ports will use it. Maybe this idea will make it possible to make a patch that permits using something else than libxml2 without having to patch build systems other than qmake.
Would it be possible to use a variety on the ResourceHandle porting approach and have a WebCore/dom/XMLTokenizerBase.{h, cpp}. The libxml2 implementation lives in WebCore/dom/XMLTokenizer.{h, cpp}. If you don't specify in your build system that you should use WebCore/dom/port/XMLTokenizer.h and WebCore/dom/port/ XMLTokenizerPort.cpp, you will get libxml2.
An added bonus is that you don't have to come up with a better name than "XMLTokenizerLibXml2.cpp" since it's still called "XMLTokenizer.cpp".
I think we should wrap the XML parser API at the platform/ layer. The main risk in doing so would be performance. - Maciej
/Arvid
On Sun, Jul 29, 2007 at 7:53 PM, Lars Knoll <lars@trolltech.com> wrote: On Sunday 29 July 2007 08:47:50 Maciej Stachowiak wrote:
On Jul 28, 2007, at 3:52 AM, Lars Knoll wrote:
On Saturday 28 July 2007 00:26:19 Maciej Stachowiak wrote:
On Jul 27, 2007, at 11:36 AM, Lars Knoll wrote:
Other organizations have requested the ability to use other XML parsers as well, such as expat. Seems like in the long run we want a different approach than just ifdefs in the XMLTokenizer.cpp file. It seems like the best would be some abstraction layer on top of the parser library, but if that is difficult then your option #2 sounds like a docent long-run approach. I would have expected just about every XML parsing library to have a SAX-like API, which shouldn't be too hard to abstract, but perhaps QXml works differently.
I guess that assumption doesn't hold. QXmlStream is a streaming parser with an API that is very different from SAX. It IMO a whole lot simpler to use than a SAX like API and is inspired from similar APIs in the Java world. If you're interested, have a look at http://doc.trolltech.com/4.3/qxmlstreamreader.html
I'm told libxml has a StreamReader-style API now as well, so if that's the better alternative, we could design the XML code around that style of API (though probably not right at the moment).
No, for the moment, I'd rather just go with the approach I've posted in bug 14791. Once there are requests for more parser backends, we could rethink this, but for now I think we have more urgent things to do.
Cheers, Lars _______________________________________________ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo/webkit-dev
_______________________________________________ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
participants (8)
-
Alexey Proskuryakov
-
Arvid Nilsson
-
Darin Adler
-
David Hyatt
-
Holger Freyther
-
Lars Knoll
-
Maciej Stachowiak
-
Rob Buis