Subject: RE: How Do I Generate A Set-Difference With Context - Part A
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Sun, 13 Mar 2005 16:54:28 -0000
> This continues my earlier post, unfortunately unresponded to,
> on the same subject. The original post is here:
> http://www.biglist.com/lists/xsl-list/archives/200503/msg00332.html
>
It's bad tactics to suggest that you're 80% of the way to a solution, and
not show us the 80%. No-one wants to redo the work you've already done on
the off-chance that they'll be able to help you with the final 20%.
That's especially true as it's a difficult problem and one has to do a lot
of guesswork about the requirements. What output would you expect if the two
source documents are:
<a><b/><c/></a>
and
<a><c/><b/></a>
?
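
To make the ambiguity concrete, here is a minimal sketch in Python (not XSLT, and not anyone's actual diff code) that compares those two documents twice: once treating child order as significant, once ignoring it. The helper name child_names is made up for illustration.

```python
import xml.etree.ElementTree as ET

first = ET.fromstring("<a><b/><c/></a>")
second = ET.fromstring("<a><c/><b/></a>")


def child_names(elem):
    """Return the tag names of the element's children, in document order."""
    return [child.tag for child in elem]


# Order-sensitive comparison: the two documents differ.
ordered_equal = child_names(first) == child_names(second)

# Order-insensitive comparison: the two documents are the same.
unordered_equal = sorted(child_names(first)) == sorted(child_names(second))

print(ordered_equal)    # False
print(unordered_equal)  # True
```

Which of the two answers is "right" depends entirely on the requirements, which is why the question needs answering before any algorithm is chosen.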
>
> The diff quest came from the following problem: I get periodic XML "feeds"
> from a news syndicate; these feeds are parsed, formatted in HTML, and
> published on a website. Each feed is an XML file, and contains zero or more
> "stories". A story may be exactly like that in the immediately-prior feed,
> may be slightly different, or may be completely new. Hence my desire to
> "diff" 2 feeds rather than simply regenerate all stories. When only, say,
> 20 stories change among 1000+ stories, this is a processing win.
This looks a rather easier problem, because order is irrelevant.
It's fairly easy, I would have thought, to identify a <story> in one file
for which there is no corresponding <story> in the other. Identifying
finer-grained differences seems to require making some assumptions: what if
one story in the second file is similar to two stories in the first file,
but not identical to either?
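
As a rough illustration of the easy part, the sketch below (Python rather than XSLT) lists every <story> in the new feed that has no identical <story> in the old one, ignoring order. The feed layout, the id attribute, and the names fingerprint and unmatched_stories are all assumptions made up for the example; the real feed format was not posted.

```python
import xml.etree.ElementTree as ET


def fingerprint(elem):
    """Hashable summary of an element: tag, attributes, trimmed text,
    and (recursively) each child plus the trimmed text that follows it."""
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),
        (elem.text or "").strip(),
        tuple((fingerprint(c), (c.tail or "").strip()) for c in elem),
    )


def unmatched_stories(old_xml, new_xml):
    """Return <story> elements of new_xml with no identical <story> in old_xml."""
    old_prints = {fingerprint(s) for s in ET.fromstring(old_xml).iter("story")}
    return [
        s for s in ET.fromstring(new_xml).iter("story")
        if fingerprint(s) not in old_prints
    ]


# Hypothetical feeds; the element and attribute names are assumptions.
old_feed = "<feed><story id='1'>Rain</story><story id='2'>Sun</story></feed>"
new_feed = (
    "<feed><story id='2'>Sun</story>"
    "<story id='1'>Rain later</story>"
    "<story id='3'>Snow</story></feed>"
)

changed = unmatched_stories(old_feed, new_feed)
print([s.get("id") for s in changed])  # ['1', '3']
```

Note that story 1 is reported simply as "not identical to anything in the old feed". Deciding that it is a revised version of an old story, rather than a new one, needs an extra assumption such as a stable identifier, which is exactly the finer-grained question raised above.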
>
> I used an augmented vset:difference from "XSLT Cookbook" ... I can't think
> of other algorithmic improvements to make; if anybody else can, please post.
Sorry, but suggesting improvements to code I haven't seen is beyond my
abilities.
Michael Kay
http://www.saxonica.com/