Subject: Re: building a hierarchical classification out of flat and redundant data
From: "Albert Juhé" <albertjuhe@xxxxxxxxx>
Date: Tue, 25 Jul 2006 10:01:12 +0200
Hi David,

Last week I ran into an amazingly similar problem; it is essentially the same one.
I have this XML:

<modul>
<unit id="1">
<subunit>Rupturas</subunit>
<sub-subunit>sistema </sub-subunit>
<sub-subunit>incertidumbre</sub-subunit>
<subunit>Megatendencias</subunit>
<sub-subunit>Caracterizacisn</sub-subunit>
<sub-sub-subunit>1.2.1.1.</sub-sub-subunit>
<p>Text 1211</p>
<param>Text 2 1211</param>
<sub-sub-subunit>1.2.1.2.</sub-sub-subunit>
<sub-sub-subunit>1.2.1.3.</sub-sub-subunit>
<sub-subunit>Vectores</sub-subunit>
<sub-sub-subunit>1.2.2.1.</sub-sub-subunit>
<sub-sub-subunit>1.2.2.2.</sub-sub-subunit>
<sub-sub-subunit>1.2.2.3.</sub-sub-subunit>
<subunit>Perspectivas</subunit>
<sub-subunit>Ideologmas</sub-subunit>
<sub-sub-subunit>1.3.1.1.</sub-sub-subunit>
<sub-sub-subunit>1.3.1.2.</sub-sub-subunit>
<sub-subunit>controversia</sub-subunit>
<sub-sub-subunit>1.3.2.1.</sub-sub-subunit>
<sub-sub-subunit>1.3.2.2.</sub-sub-subunit>
</unit>
<unit id="2">
<p>Desafmos sociolaboral</p>
<subunit>Cantidad</subunit>
<p>Text Cantidad</p>
<sub-subunit>riqueza</sub-subunit>
<sub-subunit>paramso</sub-subunit>
<sub-subunit>materia</sub-subunit>
<sub-subunit>panorama a las perspectivas</sub-subunit>
<subunit>Calidad</subunit>
<sub-subunit>Polarizacisn</sub-subunit>
<sub-subunit>La cara</sub-subunit>
<sub-subunit>La cruz</sub-subunit>
<sub-subunit>Precarizacisn</sub-subunit>
<subunit>experiencia</subunit>
<sub-subunit>Ejes</sub-subunit>
<sub-subunit>Condiciones</sub-subunit>
<sub-sub-subunit>2.3.2.1.</sub-sub-subunit>
<sub-sub-subunit>2.3.2.2.</sub-sub-subunit>
<sub-sub-subunit>2.3.2.3.</sub-sub-subunit>
<subunit>paradigma</subunit>
<sub-subunit>civilizacisn</sub-subunit>
<sub-subunit>emplemsmo</sub-subunit>
<sub-subunit>Agenda</sub-subunit>
</unit>
</modul>

I have to convert it into a hierarchical XML structure inside each unit
tag, under these conditions:
- Other tags can appear between the hierarchy tags; they belong to the
nearest preceding hierarchy tag.
- The hierarchy is: unit, subunit, sub-subunit and sub-sub-subunit.

Desired result file:

<modul>
	<unit id="1">
		<subunit>
			<title>Rupturas</title>
			<sub-subunit>
				<title>sistema </title>
			</sub-subunit>
			<sub-subunit>
				<title>incertidumbre</title>
			</sub-subunit>
		</subunit>
		<subunit>
			<title>Megatendencias</title>
			<sub-subunit>
				<title>Caracterizacisn</title>
				<sub-sub-subunit>
					<title>1.2.1.1.</title>
					<p>Text 1211</p>
					<param>Text 2 1211</param>
				</sub-sub-subunit>
				<sub-sub-subunit>
					<title>1.2.1.2.</title>
				</sub-sub-subunit>
				<sub-sub-subunit>
					<title>1.2.1.3.</title>
				</sub-sub-subunit>
			</sub-subunit>
			<sub-subunit>
				<title>Vectores</title>
				<sub-sub-subunit>
					<title>1.2.2.1.</title>
				</sub-sub-subunit>
				<sub-sub-subunit>
					<title>1.2.2.2.</title>
				</sub-sub-subunit>
				<sub-sub-subunit>
					<title>1.2.2.3.</title>
				</sub-sub-subunit>
			</sub-subunit>
		</subunit>
		<subunit>
			<title>Perspectivas</title>
			<sub-subunit>
				<title>Ideologmas</title>
				<sub-sub-subunit>
					<title>1.3.1.1.</title>
				</sub-sub-subunit>
				<sub-sub-subunit>
					<title>1.3.1.2.</title>
				</sub-sub-subunit>
			</sub-subunit>
			<sub-subunit>
				<title>controversia</title>
				<sub-sub-subunit>
					<title>1.3.2.1.</title>
				</sub-sub-subunit>
				<sub-sub-subunit>
					<title>1.3.2.2.</title>
				</sub-sub-subunit>
			</sub-subunit>
		</subunit>
	</unit>
	<unit id="2">
		<p>Desafmos sociolaboral</p>
		<subunit>
			<title>Cantidad</title>
			<p>Text Cantidad</p>
			<sub-subunit>
				<title>riqueza</title>
			</sub-subunit>
			<sub-subunit>
				<title>paramso</title>
			</sub-subunit>
			<sub-subunit>
				<title>materia</title>
			</sub-subunit>
			<sub-subunit>
				<title>panorama a las perspectivas</title>
			</sub-subunit>
		</subunit>
		<subunit>
			<title>Calidad</title>
			<sub-subunit>
				<title>Polarizacisn</title>
			</sub-subunit>
			<sub-subunit>
				<title>La cara</title>
			</sub-subunit>
			<sub-subunit>
				<title>La cruz</title>
			</sub-subunit>
			<sub-subunit>
				<title>Precarizacisn</title>
			</sub-subunit>
		</subunit>
		<subunit>
			<title>experiencia</title>
			<sub-subunit>
				<title>Ejes</title>
			</sub-subunit>
			<sub-subunit>
				<title>Condiciones</title>
				<sub-sub-subunit>
					<title>2.3.2.1.</title>
				</sub-sub-subunit>
				<sub-sub-subunit>
					<title>2.3.2.2.</title>
				</sub-sub-subunit>
				<sub-sub-subunit>
					<title>2.3.2.3.</title>
				</sub-sub-subunit>
			</sub-subunit>
		</subunit>
		<subunit>
			<title>paradigma</title>
			<sub-subunit>
				<title>civilizacisn</title>
			</sub-subunit>
			<sub-subunit>
				<title>emplemsmo</title>
			</sub-subunit>
			<sub-subunit>
				<title>Agenda</title>
			</sub-subunit>
		</subunit>
	</unit>
</modul>

This is my solution (XSLT 1.0; the templates go inside an xsl:stylesheet element):

	<xsl:template match="modul">
		<xsl:copy>
			<xsl:copy-of select="@*"/>
			<xsl:apply-templates/>
		</xsl:copy>
	</xsl:template>

	<xsl:template match="unit">
		<xsl:copy>
			<xsl:copy-of select="@*"/>
			<xsl:call-template name="process-node">
				<xsl:with-param name="node-father" select="name()"/>
			</xsl:call-template>
		</xsl:copy>
	</xsl:template>

	<!-- Copy elements -->
	<xsl:template match="*">
		<xsl:copy>
			<xsl:copy-of select="@*"/>
			<xsl:apply-templates/>
		</xsl:copy>
	</xsl:template>

	<!--
		Processes $context and its following siblings in mode "copia" until
		the element whose generate-id() equals $target.
		Not called by the other templates shown here. -->
	<xsl:template name="get-block">
		<xsl:param name="context" select="."/>
		<xsl:param name="target"/>

		<xsl:if test="generate-id($context)!=$target">
			<xsl:apply-templates select="$context" mode="copia"/>
			<xsl:variable name="next-element"
				select="$context/following-sibling::*[1]"/>
			<xsl:if test="$next-element">
				<xsl:call-template name="get-block">
					<xsl:with-param name="context" select="$next-element"/>
					<xsl:with-param name="target" select="$target"/>
				</xsl:call-template>
			</xsl:if>
		</xsl:if>

	</xsl:template>

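	<!--
		Wrap a unit, subunit, sub-subunit or sub-sub-subunit together with
		its content, then continue with the next element of the same type;
		copy any other element as it is
	-->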
	<xsl:template name="process-node">
		<xsl:param name="context" select="*[1]"/>
		<xsl:param name="node-father"/>

		<xsl:choose>
			<xsl:when test="$context[self::unit or self::subunit or
				self::sub-subunit or self::sub-sub-subunit]">
				<xsl:variable name="node-type" select="name($context)"/>
				<xsl:element name="{$node-type}">
					<title><xsl:value-of select="$context"/></title>
					<xsl:call-template name="generate-block">
						<xsl:with-param name="context"
							select="$context/following-sibling::*[1]"/>
						<xsl:with-param name="node-type" select="$node-type"/>
					</xsl:call-template>
				</xsl:element>

				<xsl:variable name="seguent-node"
					select="$context/following-sibling::*[name()=$node-type][1]"/>

				<xsl:variable name="fathers-name">
					<xsl:call-template name="get-pare">
						<xsl:with-param name="unitat" select="$node-type"/>
					</xsl:call-template>
				</xsl:variable>

				<!-- Continue only if the next element of the same type has the same
				     parent, i.e. the same nearest preceding element of the parent type -->
				<xsl:if test="$seguent-node and name($seguent-node)=$node-type and
					(generate-id($seguent-node/preceding-sibling::*[name()=$fathers-name][1])
					= generate-id($context/preceding-sibling::*[name()=$fathers-name][1]))">
					<xsl:call-template name="process-node">
						<xsl:with-param name="context" select="$seguent-node"/>
					</xsl:call-template>
				</xsl:if>

			</xsl:when>
			<xsl:otherwise>
				<xsl:apply-templates select="$context"/>
				<xsl:if test="$context/following-sibling::*">
					<xsl:call-template name="process-node">
						<xsl:with-param name="context"
							select="$context/following-sibling::*[1]"/>
					</xsl:call-template>
				</xsl:if>
			</xsl:otherwise>
		</xsl:choose>
	</xsl:template>

	<xsl:template name="generate-block">
		<xsl:param name="context"/>
		<xsl:param name="node-type"/>

		<xsl:if test="$context">
			<!-- Where does processing stop? -->
			<xsl:variable name="pares">
				<xsl:call-template name="get-ordre-unitat">
					<xsl:with-param name="unitat" select="$node-type"/>
				</xsl:call-template>
			</xsl:variable>
			<xsl:variable name="node-limit"
				select="contains($pares,concat('*',name($context),'*'))"/>

			<xsl:if test="not($node-limit)">
				<xsl:choose>
					<xsl:when test="$context[self::unit or self::subunit or
						self::sub-subunit or self::sub-sub-subunit]">
						<xsl:call-template name="process-node">
							<xsl:with-param name="context" select="$context"/>
						</xsl:call-template>
					</xsl:when>
					<xsl:otherwise>
						<xsl:apply-templates select="$context"/>
						<xsl:call-template name="generate-block">
							<xsl:with-param name="context"
								select="$context/following-sibling::*[1]"/>
							<xsl:with-param name="node-type" select="$node-type"/>
						</xsl:call-template>
					</xsl:otherwise>
				</xsl:choose>
			</xsl:if>
		</xsl:if>

	</xsl:template>

	<!-- Returns the element names that end a block of type $unitat
	     (its own level and the levels above it) -->
	<xsl:template name="get-ordre-unitat">
		<xsl:param name="unitat"/>

		<xsl:choose>
			<xsl:when test="$unitat='unit'">
				<xsl:value-of select="'*unit*'"/>
			</xsl:when>
			<xsl:when test="$unitat='subunit'">
				<xsl:value-of select="'*unit*subunit*'"/>
			</xsl:when>
			<xsl:when test="$unitat='sub-subunit'">
				<xsl:value-of select="'*unit*subunit*sub-subunit*'"/>
			</xsl:when>
			<xsl:when test="$unitat='sub-sub-subunit'">
				<xsl:value-of select="'*unit*subunit*sub-subunit*sub-sub-subunit*'"/>
			</xsl:when>
		</xsl:choose>

	</xsl:template>

	<!-- Returns the name of the parent level -->
	<xsl:template name="get-pare">
		<xsl:param name="unitat"/>

		<xsl:choose>
			<xsl:when test="$unitat='unit'">
				<xsl:value-of select="''"/>
			</xsl:when>
			<xsl:when test="$unitat='subunit'">
				<xsl:value-of select="'unit'"/>
			</xsl:when>
			<xsl:when test="$unitat='sub-subunit'">
				<xsl:value-of select="'subunit'"/>
			</xsl:when>
			<xsl:when test="$unitat='sub-sub-subunit'">
				<xsl:value-of select="'sub-subunit'"/>
			</xsl:when>
		</xsl:choose>

	</xsl:template>
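
For comparison, the same nesting can also be sketched with xsl:key instead of recursive named templates. This is only a minimal sketch, not the solution above, and it rests on assumptions: it is a complete XSLT 1.0 stylesheet, the key name "kids" is made up for the illustration, every hierarchy tag is a direct child of unit, and no sub-subunit or sub-sub-subunit appears before the first subunit of a unit. Each element is indexed under the generate-id() of the element that should contain it, so a hierarchy tag only has to pull in its own entries.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

	<xsl:output method="xml" indent="yes"/>
	<xsl:strip-space elements="modul unit"/>

	<!-- One key, three declarations: each element is indexed by the
	     generate-id() of the element that should contain it. -->
	<xsl:key name="kids" match="unit/sub-subunit"
		use="generate-id(preceding-sibling::subunit[1])"/>
	<xsl:key name="kids" match="unit/sub-sub-subunit"
		use="generate-id(preceding-sibling::*[self::subunit or self::sub-subunit][1])"/>
	<!-- Anything that is not a hierarchy tag belongs to the nearest
	     preceding hierarchy tag, whatever its level. -->
	<xsl:key name="kids"
		match="unit/*[not(self::subunit or self::sub-subunit or self::sub-sub-subunit)]"
		use="generate-id(preceding-sibling::*[self::subunit or self::sub-subunit or self::sub-sub-subunit][1])"/>

	<!-- Identity rule for everything that is not restructured -->
	<xsl:template match="@*|node()">
		<xsl:copy>
			<xsl:apply-templates select="@*|node()"/>
		</xsl:copy>
	</xsl:template>

	<!-- A unit emits its leading content and its subunits; deeper
	     levels are pulled in through the key. -->
	<xsl:template match="unit">
		<xsl:copy>
			<xsl:copy-of select="@*"/>
			<xsl:apply-templates select="*[self::subunit or not(preceding-sibling::subunit)]"/>
		</xsl:copy>
	</xsl:template>

	<!-- A hierarchy tag becomes a wrapper: its text goes into title,
	     followed by the elements indexed under it, in document order. -->
	<xsl:template match="unit/subunit | unit/sub-subunit | unit/sub-sub-subunit">
		<xsl:copy>
			<title><xsl:value-of select="."/></title>
			<xsl:apply-templates select="key('kids', generate-id())"/>
		</xsl:copy>
	</xsl:template>

</xsl:stylesheet>
```

One difference in behaviour: if a level is skipped (a sub-sub-subunit directly after a subunit), this variant attaches it to that subunit, while the recursive templates above follow their own stop rules.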


2006/7/24, Georg Hohmann <georg.hohmann@xxxxxxxxx>:
Dear XSLT-Community,

I have a problem with a "strange" type of data that I have to
convert to a hierarchical XML structure.

My source is a huge XML file that represents a decimal
classification. It contains so-called documents, where each document
represents one node of the classification. Furthermore, each document
shows the direct parents of its node. The structure looks like this
(example taken from http://www.udcc.org):
...
<document>
       <tag1>3</tag1>
       <tag1a>Social Sciences</tag1a>
</document>
<document>
       <tag1>3</tag1>
       <tag1a>Social Sciences</tag1a>
       <tag2>32</tag2>
       <tag2a>Politics</tag2a>
</document>
<document>
       <tag1>3</tag1>
       <tag1a>Social Sciences</tag1a>
       <tag2>32</tag2>
       <tag2a>Politics</tag2a>
       <tag3>326</tag3>
       <tag3a>Slavery</tag3a>
</document>
...
As you can see, there is no hierarchical information in it apart from
the names and the sequence of the tags. In my real data I have up to 9
levels, but not in every document. My result should look like this (or
something similar):
...
<node id="3" name="Social Sciences">
  <node id="32" name="Politics">
     <node id="326" name="Slavery"/>
  </node>
</node>
...
I simply have no idea where to start to achieve this result. I
guess the first step would be to get rid of all the redundant
content, but I don't know how. And I can't figure out how to
build the hierarchical structure at the same time.

Does anyone have a good starting point for this?
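
One possible starting point for the question above, sketched in XSLT 1.0 under several assumptions that the message does not confirm: all document elements are children of a single root element, the code elements are the ones whose names do not end in "a" (tag1, tag2, ...), each code element is directly followed by its label (tag1a, tag2a, ...), every code is unique, and each node has exactly one document. The key name "by-parent" and the output root "classification" are made up for the illustration. Every document is indexed by the code of its parent, so the redundant ancestor entries never need to be removed; they are simply not used.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

	<xsl:output method="xml" indent="yes"/>

	<!-- Index every document by the code of its parent: the code element
	     just before its own one. Code elements are assumed to be those
	     whose names do not end in "a". -->
	<xsl:key name="by-parent" match="document"
		use="string(*[substring(name(), string-length(name())) != 'a'][last() - 1])"/>

	<!-- Assumption: a single root element wraps all document elements.
	     Top-level nodes are the documents without a tag2 element. -->
	<xsl:template match="/*">
		<classification>
			<xsl:apply-templates select="document[not(tag2)]"/>
		</classification>
	</xsl:template>

	<xsl:template match="document">
		<!-- The last code element is the document's own code -->
		<xsl:variable name="code"
			select="*[substring(name(), string-length(name())) != 'a'][last()]"/>
		<!-- Its label is assumed to be the element right after it -->
		<node id="{$code}" name="{$code/following-sibling::*[1]}">
			<xsl:apply-templates select="key('by-parent', string($code))"/>
		</node>
	</xsl:template>

</xsl:stylesheet>
```

If the same node can occur in more than one document, the duplicates would have to be filtered first, for example with Muenchian grouping on the code.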
