[XML-DEV Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message] Re: Re: Discover data patterns or Create data patterns?
"Michael Kay" <mike@s...> wrote in message 8D80A281FA3E4ED5B92DC5AFF2BA16F8@Sealion">news:8D80A281FA3E4ED5B92DC5AFF2BA16F8@Sealion... >> A good example of a problem that can be completely >> eliminated: use a Finger-Tree-based sequence for all >> sequences, then there is no need to worry to convert from one >> sequence type to another (of course, the items still need to >> be of the same type). Not to mention the gains in efficiency. >> > > Well, perhaps it was reading your blog that triggered me to use this > example. But I don't believe it's generally the case that for any given > interface, there is one implementation that will be optimum (or even "good > enough") across all usage patterns. I am sorry I couldn't find free time to reply to a similar comment on my blog, hope now I am replying to it, too. I share your opinion. Even if a finger-tree-based collection is good enough in 80% of the cases, then it would be a useful strategy to have this implementation as the basic (default) one at the start and then replace it with something else only in cases where this is proven necessary. I believe that in most cases the finger tree will be satisfactory enough. Plus the added benefit of not having to maintain and think of different collection implementations. Probably in many applications we'll not need to substitute the finger tree with something else. Only in special cases such need will be felt and justified. > > On the specific question of the fingertree and its possible use as an > implementation of XPath 2.0 sequences in memory, I'm a little sceptical. > Firstly, XSLT/XPath implementations try to avoid storing sequences in memory > wherever possible, relying instead on lazy evaluation and pipelining. The finger-tree data structure allows a lazy implementation, too. In fact, I haven't done this in a completely lazy way: just to leave some more fun for the future and more seriously, because I did doubt a completely lazy implementation would result in more significant performance gains. One way to add more laziness is not to construct the "innerFT" finger tree until some future operation doesn't require this. Because of the recursive definition of this structure, laziness is possible at any level of recursion. > Secondly, where operations are performed on sequences, the original input > usually comes from a path expression - that is, an iterator over nodes in an > XML tree. In your performance measurements, to show the superiority of this > data structure, you had to suppress Saxon's optimizations by writing > obfuscatory code to defeat the optimizer. I would be more interested if you > could show that real programs benefit even when you allow the optimizer (and > lazy evaluation / pipelining) to do its work. That will be the case whenever there are repeated append, prepend, concat, insert, delete and split operations on a sequence. Without an optimized data structure the complexity of such operations is usually O(N^2). With a finger tree the complexity is either O(1) (for prepend and append), or sublinear. I will be interested to perform such an experiment with real XSLT applications (the two Sudoku solvers and the FXSL LR parsing system naturally come to mind) whenever I find some free time. An ordered collection implemented on a finger tree can be naturally used in XPath operations -- it will sort and dedublicate nodes without the need for any additional code. > > You also apply the data structure to storage of strings. For string > operations (such as substring and concat), Saxon isn't currently using lazy > evaluation or pipelining; but it could, and my instinct is that this could > match the benefits you are seeing for the kind of application you are using > as a test case. This would be interesting to see. The string tests were one of the very few in which Saxon wasn't the best of the three tested XSLT processors. Even the one that obviously used a directly addressable char array was significantly slower than the finger-tree-based string. Of course, the performance gains can be demonstrated and can be useful on collections/strings of moderate or big length; a collection of a few items does not need any optimization. > > (Reading this through, it sounds too dismissive. I'm not rejecting the idea, > I'm just expressing initial scepticism). > > Michael Kay Any false enthusiasm would be really harmful, your interest and analysis are truly valuable. Thanks, Dimitre Novatchev > > > > _______________________________________________________________________ > > XML-DEV is a publicly archived, unmoderated list hosted by OASIS > to support XML implementation and development. To minimize > spam in the archives, you must subscribe before posting. > > [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/ > Or unsubscribe: xml-dev-unsubscribe@l... > subscribe: xml-dev-subscribe@l... > List archive: http://lists.xml.org/archives/xml-dev/ > List Guidelines: http://www.oasis-open.org/maillists/guidelines.php > >
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] |
PURCHASE STYLUS STUDIO ONLINE TODAY!Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced! Download The World's Best XML IDE!Accelerate XML development with our award-winning XML IDE - Download a free trial today! Subscribe in XML format
|