[XSL-LIST Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message] Re: is there a way to hash an element?
On Sat, Jun 11, 2016 at 05:21:09PM -0000, Dimitre Novatchev dnovatchev@xxxxxxxxx scripsit: Hi Dimitre -- > Actually, I believe that calling deep-equal() can be more efficient > than comparing hashes. > > The reason is simple: deep-equal() most probably returns false at the > first possible moment -- for example, noticing that an element has > different attributes than its counterpart. > > On the other side, with hashing, the hashes for the two whole > subtrees have to be calculated and only after that they can be > compared. > > To summarize, with the exception of the case when the two subtrees are > equal, deep-equal may perform faster than generating and comparing > hashes on the subtrees. I've got one input document with ~5000 trees that are mappable to XSD schema definitions; about half are complexTypes. Many are structurally the same but have different names. (All ~5000 have unique names.) The idea is to group them by structural sameness; deep-equal, even very efficiently implemented deep-equal, gives me n^2 as I have to go through the whole tree for each element and ask "are you like me?" pairwise. Some of the equivalent structures will have a lot of matches -- hundreds -- where I can't expect deep-equal to fail quickly and thus efficiently. Going through and decorating every element with its hash value (@hash="something") and then using for-each-group on the lot on the basis of the hash gives me 2n. Even if it's a very naive hash implementation, I'd expect 2n to beat n^2 performance. Am I missing something? (I'll certainly keep deep-equal in mind if the hash approach has unacceptable performance.) -- Graydon
|
PURCHASE STYLUS STUDIO ONLINE TODAY!Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced! Download The World's Best XML IDE!Accelerate XML development with our award-winning XML IDE - Download a free trial today! Subscribe in XML format
|