[XSL-LIST Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message] Re: Find the number of elements that are prior to the
On Tue, Mar 12, 2019 at 1:31 PM Michael Kay mike@xxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote: > > In performance terms, of course, a solution using keys incurs a one-time cost to build indexes, which is amortized over a number of searches; so the performance benefits depend on whether the search is done once, or repeatedly. > > If there are "hundreds of thousands" of Byte elements, the space used by the index can become a significant factor. > > Also, a solution using keys will almost inevitably process the whole input sequence. Other solutions are likely to stop reading the input sequence when the first match is found. Wouldn't "hundreds of thousands" of Byte elements be also a (both performance and) space problem when using index-of() -- especially if a starting subsequence of the searched string happens regularly in the document? Also RegEx matching becomes not so efficient with large strings, too. It would be nice if a particular XSLT processor implements lazily the index-of() function, allowing its partial results to be immediately accessible to the evaluator of the whole XPath expression -- in this case only, the solution using index-of() will stop immediately after the successful index value is produced -- otherwise the evaluator will need to perform on the complete index. Of course, exactly the same reasoning applies to the implementation of <xsl:key> and the key() function, if a smart enough XSLT processor allows the key() function to operate on a partially-built index, and when a solution is found the building of the index is halted. So, **depending on implementation details**, the other solutions **may** have a better average performance for a single search. With multiple searches though, using keys will be more performant. Also, depending on where the sought substring is located for the first time in the document, if it isn't found at all or is at the end of the sequence of Byte elements, then exhaustive string scanning will be performed with these other solutions too. If we accept that an "average" search will involve scanning half of the string, then if 3 or more searches are performed, the key-based solution will have incurred the cost of processing the document just once, while the accumulated string scanning of the other solutions will be over greater (accumulated) length of sequences of Byte elements. Maybe we need a performance comparison of the different solutions, over different source XML documents that cover these cases? Cheers, Dimitre > > Michael Kay > Saxonica > > On 12 Mar 2019, at 20:20, Dimitre Novatchev dnovatchev@xxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote: > > Here is an XSLT 2.0 solution using keys. In its current form it > returns the wanted offset (length) as a hexadecimal number. To convert > this to a decimal number, have a look at the archives where AFAIR > there is such code -- could be even posted by me ... This conversion > to decimal can certainly be written elegantly as fold-left() in > XSLT/XPath 3 > > > XSL-List info and archive > EasyUnsubscribe (by email) -- Cheers, Dimitre Novatchev --------------------------------------- Truly great madness cannot be achieved without significant intelligence. --------------------------------------- To invent, you need a good imagination and a pile of junk ------------------------------------- Never fight an inanimate object ------------------------------------- To avoid situations in which you might make mistakes may be the biggest mistake of all ------------------------------------ Quality means doing it right when no one is looking. ------------------------------------- You've achieved success in your field when you don't know whether what you're doing is work or play ------------------------------------- To achieve the impossible dream, try going to sleep. ------------------------------------- Facts do not cease to exist because they are ignored. ------------------------------------- Typing monkeys will write all Shakespeare's works in 200yrs.Will they write all patents, too? :) ------------------------------------- Sanity is madness put to good use. ------------------------------------- I finally figured out the only reason to be alive is to enjoy it.
|
PURCHASE STYLUS STUDIO ONLINE TODAY!Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced! Download The World's Best XML IDE!Accelerate XML development with our award-winning XML IDE - Download a free trial today! Subscribe in XML format
|