Hi Folks,
In the spirit of UNIX tool building .....
I created a simple tool that converts records of tab-delimited data into XML. For example, these records:
title	authors	date	isbn	publisher
Unix Shell Programming	Stephen G. Kochan, Patrick Wood	2019	0-872-32400-3	SAMS
Small, Sharp Software Tools	Brian P. Hogan	2019	978-1-68050-296-1	The Pragmatic Programmers
The AWK Programming Language	Alfred V. Aho, Brian W. Kernighan, Peter J. Weinberger	1988	0-201-07981-X	Addison-Wesley Publishing Company
are converted to this XML:
<document>
<row>
<title>Unix Shell Programming</title>
<authors>Stephen G. Kochan, Patrick Wood</authors>
<date>2019</date>
<isbn>0-872-32400-3</isbn>
<publisher>SAMS</publisher>
</row>
<row>
<title>Small, Sharp Software Tools</title>
<authors>Brian P. Hogan</authors>
<date>2019</date>
<isbn>978-1-68050-296-1</isbn>
<publisher>The Pragmatic Programmers</publisher>
</row>
<row>
<title>The AWK Programming Language</title>
<authors>Alfred V. Aho, Brian W. Kernighan, Peter J. Weinberger</authors>
<date>1988</date>
<isbn>0-201-07981-X</isbn>
<publisher>Addison-Wesley Publishing Company</publisher>
</row>
</document>
Each record is wrapped in a <row>...</row> element. Each field in a record is wrapped in an element named after the corresponding column header. The root element is <document>...</document>.
The tool may be invoked with a file, like this:
toxml books.txt
or from standard input, like this:
cat books.txt | toxml
The tool is a small AWK program, which I named "toxml":
---------------------------------------------------------
awk '
BEGIN { # field separator is tab (\t)
        # record separator is LF (\n)
        OFS=FS="\t"
        RS="\n"
        print "<document>"
      }
NR==1 { # store column header names in an array
        for (i=1; i<=NF; i++)
            header[i]=$i
      }
NR!=1 { # create a <row>...</row> element for the line
        # surround field $i with a start/end tag named header[i]
        print "<row>"
        for (i=1; i<=NF; i++)
            print "<" header[i] ">" $i "</" header[i] ">"
        print "</row>"
      }
END   { print "</document>" }' "$@"
---------------------------------------------------------
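One thing the program above does not handle is a field value that contains &, < or >, which would make the output ill-formed XML. Here is a minimal sketch of one way to deal with that, assuming the same tab-delimited input with a header line. The names toxml_escaped and esc are hypothetical (they are not part of the original tool), and the header names are still assumed to be valid XML element names:

```shell
# toxml_escaped: hypothetical variant of toxml that escapes XML special
# characters in field values before wrapping them in tags.
toxml_escaped() {
    awk '
    # Replace &, < and > with the predefined XML entities.
    # "&" is handled first so that the entities added afterwards
    # are not escaped a second time.
    function esc(s) {
        gsub(/&/, "\\&amp;", s)   # "\\&" yields a literal & in the replacement
        gsub(/</, "\\&lt;", s)
        gsub(/>/, "\\&gt;", s)
        return s
    }
    BEGIN   { FS = "\t"; print "<document>" }
    NR == 1 { # store column header names, then skip to the next line
              for (i = 1; i <= NF; i++)
                  header[i] = $i
              next
            }
            { # every remaining line becomes one <row> element
              print "<row>"
              for (i = 1; i <= NF; i++)
                  print "<" header[i] ">" esc($i) "</" header[i] ">"
              print "</row>"
            }
    END     { print "</document>" }' "$@"
}
```

It can be used the same way as the original, for example:

printf 'title\tnote\nAT&T\ta<b\n' | toxml_escaped

which should wrap the value AT&T as <title>AT&amp;T</title>.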