[XML-DEV Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message]

Re: joining parallel markup


two trees that are parallel
On 3/18/06, Oleg A. Paraschenko <olpa@x...> wrote:
<snip/>
> >
> > I haven't given it much thought,  but it seems that even the general
> > case of merging non-structurally identical trees is NP complete.
> > Consider the fact that identically named nodes may lie on different
> > branches.  The only way to evaluate whether two such nodes can in fact
> > be merged is to look at the relationship of such nodes to every other
> > node on the tree and decide if they match (whatever "match" might mean
> > in this case).  As such, this looks to be a classic travelling
> > salesman traversal...
>
> Either we refer to different meanings of "join markup", either our
> approaches are very different.
>
> My algorithm is quite straightforward. I have two trees. One is "stable"
> (order and hierarchy of the elements isn't changes), another one is "weak"
> (the elements can be fused).
>
> I traverse the first tree, visiting the character data chunks which are
> continuous (only text, no tags) in the both trees. Each chunk is wrapped
> by its context from the second tree.
>
> An example.
>
> first tree: <x><b>some dummy</b> text</x>
> second tree: <x>some <i>dummy text</i></x>
>
> There are three chunks:
>
> "some "
> "dummy"
> " text"
>
> The corresponding contexts:
>
> /x
> /x/i
> /x/i
>
> So the expected result is:
>
> <x><b><x>some </x><x><i>dummy</i></x></b><x><i> text</i></x></x>
>
> Fortunately, in many situations, it's very easy to make the code smart to
> simplify contexts. In this examples, simplified contexts are <empty>, "i",
> and "i", and the better result is:
>
> <x><b>some <i>dummy</i></b><i> text</i></x>

If I understand correctly you are matching only on text nodes and the
basic textual content is always the same between the two trees?   If
so, then in general this cannot be completely defined, there could be
ambiguity in the resultant element matching.

Eg, (somewhat artificial):

    This is a <ol><li><ul><li>problem.</li></ul></li></ol>

    This is a <li class="bad">problem.</li>

However, I'd guess that in practice this works just fine most of the
time.  Unfortunately, I don't see any utility for us...

--
Peter Hunsberger

PURCHASE STYLUS STUDIO ONLINE TODAY!

Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!

Buy Stylus Studio Now

Download The World's Best XML IDE!

Accelerate XML development with our award-winning XML IDE - Download a free trial today!

Don't miss another message! Subscribe to this list today.
Email
First Name
Last Name
Company
Subscribe in XML format
RSS 2.0
Atom 0.3
 

Stylus Studio has published XML-DEV in RSS and ATOM formats, enabling users to easily subcribe to the list from their preferred news reader application.


Stylus Studio Sponsored Links are added links designed to provide related and additional information to the visitors of this website. they were not included by the author in the initial post. To view the content without the Sponsor Links please click here.

Site Map | Privacy Policy | Terms of Use | Trademarks
Free Stylus Studio XML Training:
W3C Member
Stylus Studio® and DataDirect XQuery ™are products from DataDirect Technologies, is a registered trademark of Progress Software Corporation, in the U.S. and other countries. © 2004-2013 All Rights Reserved.