
Re: Tool converts records to XML

  • From: Michael Kay <mike@saxonica.com>
  • To: Roger L Costello <costello@mitre.org>
  • Date: Tue, 15 Nov 2022 00:57:08 +0000

You've been at this game long enough, Roger, to have seen the "Barnes & Noble" problem. The number one blunder when writing XML is failing to escape `<` and `&` when they occur in your input.
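
To make the point concrete, here is a minimal sketch of how the field values could be escaped before they are wrapped in tags. The names `toxml_fields` and `esc` are hypothetical and are not part of the original script; the sketch only covers `&` and `<` in field content and assumes the header names are already valid XML element names.

```shell
# toxml_fields is a hypothetical wrapper, shown only to illustrate escaping.
toxml_fields() {
    awk '
    # esc() is an assumed helper: replace & and < with their entity references.
    function esc(s) {
        gsub(/&/, "\\&amp;", s)   # do & first, so the & in &lt; is not re-escaped
        gsub(/</, "\\&lt;", s)
        return s
    }
    BEGIN   { FS = "\t" }
    NR == 1 { for (i = 1; i <= NF; i++) header[i] = $i; next }
            { for (i = 1; i <= NF; i++)
                  print "<" header[i] ">" esc($i) "</" header[i] ">" }
    ' "$@"
}

printf 'publisher\nBarnes & Noble\n' | toxml_fields
# prints: <publisher>Barnes &amp; Noble</publisher>
```

In an AWK replacement string a bare `&` stands for the matched text, which is why the literal ampersand is written as `\\&`.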

Michael Kay
Saxonica

> On 14 Nov 2022, at 23:09, Roger L Costello <costello@mitre.org> wrote:
> 
> Hi Folks,
> 
> In the spirit of UNIX tool building .....
> 
> I created a simple tool that converts records of tab-delimited data into XML. For example, these records:
> 
> title	authors	date	isbn	publisher
> Unix Shell Programming	Stephen G. Kochan, Patrick Wood	2019	0-872-32400-3	SAMS
> Small, Sharp Software Tools	Brian P. Hogan	2019	978-1-68050-296-1	The Pragmatic Programmers
> The AWK Programming Language	Alfred V. Aho, Brian W. Kernighan, Peter J. Weinberger	1988	0-201-07981-X	Addison-Wesley Publishing Company
> 
> are converted to this XML:
> 
> <document>
> 	<row>
> 		<title>Unix Shell Programming</title>
> 		<authors>Stephen G. Kochan, Patrick Wood</authors>
> 		<date>2019</date>
> 		<isbn>0-872-32400-3</isbn>
> 		<publisher>SAMS</publisher>
> 	</row>
> 	<row>
> 		<title>Small, Sharp Software Tools</title>
> 		<authors>Brian P. Hogan</authors>
> 		<date>2019</date>
> 		<isbn>978-1-68050-296-1</isbn>
> 		<publisher>The Pragmatic Programmers</publisher>
> 	</row>
> 	<row>
> 		<title>The AWK Programming Language</title>
> 		<authors>Alfred V. Aho, Brian W. Kernighan, Peter J. Weinberger</authors>
> 		<date>1988</date>
> 		<isbn>0-201-07981-X</isbn>
> 		<publisher>Addison-Wesley Publishing Company</publisher>
> 	</row>
> </document>
> 
> Each record is wrapped in a <row>...</row> element. The fields in each record are wrapped in an element named by the header. The root element is <document>...</document>.
> 
> The tool may be invoked with a file, like this:
> 
> toxml books.txt
> 
> or from standard input, like this:
> 
> cat books.txt | toxml
> 
> The tool is a small AWK program, which I named "toxml":
> ---------------------------------------------------------
> awk '
> BEGIN {
>     # field separator is tab (\t), record separator is LF (\n)
>     OFS = FS = "\t"
>     RS = "\n"
>     print "<document>"
> }
> NR == 1 {
>     # store column header names in an array
>     for (i = 1; i <= NF; i++)
>         header[i] = $i
> }
> NR != 1 {
>     # create a <row>...</row> element for the line;
>     # surround field $i with a start/end tag named header[i]
>     print "<row>"
>     for (i = 1; i <= NF; i++)
>         print "<" header[i] ">" $i "</" header[i] ">"
>     print "</row>"
> }
> END { print "</document>" }' "$@"
> ---------------------------------------------------------