
Re: XSLT versus AWK

  • From: Michael Kay <mike@saxonica.com>
  • To: Roger L Costello <costello@mitre.org>
  • Date: Sun, 31 Jul 2022 18:14:02 +0100

The problem here is you're dealing with two data formats, field-separated text and XML, and AWK only handles one of them. You're therefore resorting to creating XML "by hand", and like most people who create XML by hand, you're getting it wrong - you're not bothering to escape special characters like `<` and `&`, and you're not bothering to check that the field names are valid XML names. Mistakes like that account for most of the bad XML that we all have to deal with, and that's why most of us recommend using XML-aware tools not just to parse XML, but to create it as well.
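
The escaping problem can be seen in a few lines. The following is a minimal sketch, assuming Python's standard library as a stand-in for an XML-aware tool; the sample value and the helper `is_well_formed` are hypothetical, chosen only for illustration. It builds the same element once by plain string concatenation and once with the text escaped, then checks whether each result parses.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical field value containing the two characters that need escaping.
value = "AT&T <Field>"

# Hand-built markup via string concatenation: the raw & and < end up in the output.
by_hand = "<Name>" + value + "</Name>"

# Escaping the text content first produces well-formed markup.
escaped = "<Name>" + escape(value) + "</Name>"


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(by_hand))  # False
print(is_well_formed(escaped))  # True
```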

Lesson learned: use the right programming language for the right data format.

Michael Kay
Saxonica

On 31 Jul 2022, at 12:22, Roger L Costello <costello@mitre.org> wrote:

Hi Folks,

XSLT is a programming language specifically designed for processing textual data that is formatted as XML.

AWK is a programming language specifically designed for processing textual data that is formatted as records containing fields. Interestingly, I have observed that the records/fields format is the one used for input and output by most UNIX tools.

XSLT and AWK are mature programming languages. XSLT was created roughly 24 years ago at the W3C. AWK was created roughly 45 years ago at Bell Labs by Alfred Aho, Peter Weinberger, and Brian Kernighan (the name AWK comes from their last names).

XSLT and AWK substantially reduce -- relative to other programming languages -- the amount of code, time, and effort needed to process their respective data formats. A developer will be far more productive writing an XSLT program to process XML-formatted data than if he were to write the program in some other programming language. A developer will be far more productive writing an AWK program to process records-and-fields-formatted data than if he were to write the program in some other programming language. 

XSLT and AWK are complementary. An XSLT program can convert an XML document into a document containing records and fields. An AWK program can convert a document consisting of records and fields into an XML document. In fact, just yesterday I did that very thing -- I wrote an AWK program to convert a huge document of records with tab-delimited fields, where the first record contained the column headers, into XML. See my simple, short AWK program below (note: I am an AWK newbie, so there are likely better ways to write the program).

Lesson Learned: Use the right programming language for the right data format.

/Roger

convert2xml.awk

BEGIN   {  # field separator is tab (x09)
           # record separator is CRLF (\r\n)
           FS = "\t"
           RS = "\r\n"
           print "<Airport>"
        }
NR==1   {  # get column header names, store in an array
           for (i=1; i<=NF; i++)
              header[i] = $i;
        }
NR!=1   {  # create a <Row>...</Row> element for the line
           # surround field $i with a start/end tag named header[i]
           print "<Row>"
           for (i=1; i<=NF; i++)
              print "<" header[i] ">" $i "</" header[i] ">"
           print "</Row>"
        }
END     { print "</Airport>" }
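
For comparison, here is a sketch of the same conversion done with an XML-aware serializer, so that `<` and `&` in field values are escaped on output and column headers are checked before being used as element names. It assumes Python's standard library (`csv`, `xml.etree.ElementTree`); the function `tsv_to_xml`, the pattern `NAME_RE`, and the sample data are hypothetical, and the name check is a deliberately conservative ASCII approximation rather than the full XML name grammar.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

# Conservative, ASCII-only approximation of an XML element name (an assumption
# of this sketch; the XML specification allows many more characters).
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*\Z")


def tsv_to_xml(text):
    """Convert tab-delimited text (first record = column headers) to an XML string."""
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter="\t", quoting=csv.QUOTE_NONE)
    headers = next(reader)
    for name in headers:
        if not NAME_RE.match(name):
            raise ValueError(f"column header is not a usable element name: {name!r}")

    root = ET.Element("Airport")
    for record in reader:
        if not record:  # skip empty lines
            continue
        row = ET.SubElement(root, "Row")
        for name, value in zip(headers, record):
            # Assigning to .text lets the serializer escape & and < on output.
            ET.SubElement(row, name).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical two-column input with CRLF record separators.
sample = "Code\tName\r\nBOS\tAT&T <Logan>\r\n"
print(tsv_to_xml(sample))
```

Unlike the AWK version, this sketch writes the whole document on one line and silently ignores fields beyond the number of headers; both are choices of the sketch, not requirements.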



