Transforming XML to Fixed-Width

Assume we have the following XML document which represents a collection of used bikes, our requirement is to import it into a legacy mainframe order system which accepts exclusively a fixed-with format.

<lotdate="2016-05-11T00:00:00"lot_number="1234">     
    <bike>
        <make>BMW</make>
        <model>R1150RS</model>
        <year>2004</year>
        <mileage>14274</mileage>
    </bike>
    <bike>
        <make>Kawasaki</make>
        <model>GPz1100</model>
        <year>1996</year>
        <mileage>60234</mileage>
    </bike>
</lot>

 

The fixed-width format layout is made of 3 record types: header, bike and footer. Each record contains field which must be the indicated size, when a filed value is shorter, it has to be padded with spaces.

For example, field “Lot” must be 10 characters long, in our example the lot number is 1234 therefore it needs to be padded with 6 spaces

 

HEADER Record, mandatory,occurs 1.

Name

Value

Offset

Size

Type

H

0

1

Date

 

1

8

Lot

 

9

10

 

BIKE Record, mandatory,occurs > 0.

Name

Value

Offset

Size

Type

B

0

1

Make

 

1

10

Model

 

11

10

Year

 

21

4

Mileage

 

24

7

 

FOOTER Record, mandatory, occurs 1.

Name

Value

Offset

Size

Type

F

0

1

count

 

1

3

 

The XML input sample needs to be transformed to the following fixed width representation

H201605111234
BBMW       R1150RS200414274
BKawasaki GPz 1100199660234
F2 

XQuery only solution

In our first approach we will use Stylus Studio to create an XQuery program to accomplish our task.

Our first task is to create a user defined function which pads a given value to a specific length

module namespace u = "urn:dsasoft:xquery:util" ;

 

declare function u:field($fieldName as xs:string, $value, $offset, $size as xs:integer, $pad) as xs:string
{
     let $value_len := string-length($value)
     return
     if($value_len > $size) then
      fn:error(fn:QName( "urn:dsasoft:errors" ,"error"), concat("value:", $value," too big to fit in ", $fieldName), ())
     else
            let $blanks := string-join(for $i in 1 to $size - $value_len return " ", "")
            let $newValue :=
if ($pad = 'right') then concat($value, $blanks)
else concat($blanks,$value)
            let $off_set_str := "" (:concat('[',$offset,']'):)
            let $len := "" (:concat("&#10;&#10;", '[',$offset + string-length($newValue),'- ',$fieldName,']', "&#10;"):)
            return concat($off_set_str,$newValue, $len)
};

We also create a convenient function for converting date from ISO format to YYYYMMDD

declare function u:format_CMYYMMDD_date($date as xs:dateTime)
{
    ddtek:format-dateTime($date, "[Y0001][M01][D01]")
};

 

 

In our main XQuery program, we start creating each record, assembling each field using the function “field” to pad the values

import module namespace u = "urn:dsasoft:xquery:util" at "toFixedWidth_lib.xquery" ;
declare variable $pad := "right";
declare variable $EOL := "&#10;";

declare option ddtek:serialize "method=text";



let $HEADER :=(
                 u:field("type", "H" , 0 , 1, $pad)
                ,u:field("date", u:format_CMYYMMDD_date(/lot/@date) , 1 , 8, $pad)
                ,u:field("lot" , /lot/@lot_number , 9 , 10, $pad)
                ,$EOL
)
let $BIKES :=
        for $bike in /lot/bike
        return (
                         u:field("type" , "B" , 0 , 1, $pad)
                        ,u:field("make" , $bike/make , 1 , 10, $pad)
                        ,u:field("model" , $bike/model , 11 , 10, $pad)
                        ,u:field("year" , $bike/year , 21 , 4, $pad)
                        ,u:field( "milage" , $bike/mileage, 25 , 7, $pad)
                        ,$EOL
            )

let $FOOTER := (
                 u:field("type", "F" , 0 , 1, $pad)
                ,u:field("count", string(count(/lot/bike)), 1 , 3, $pad)
                ,$EOL
)

return string-join( ($HEADER, $BIKES, $FOOTER), "")

 

The above solution requires XQuery understanding and may not scale on very large file because the all content needs to be merge in one string in memory


 

 

Using Generic Converter

Starting with version X16, Stylus Studio features a new general purpose bi-directional converter which provides great support for fixed-width formats.

We start creating a new EDI to XML Conversion

 

 

We want to create conversion with Dialect = GENERIC

 

We right click on “Transaction Messages” to create a new Message

 

 

 

 

We right click on “Segments” to create a new segment “HEADER”.  Segments represent records in our file format, we will create a segment for each record type: HEADER, BIKE and FOOTER

 

 

 

After the segment is created we want to set the parse mode to Fixed-width and the regular expression to match any line that starts with H,

 

 

 

We right click on “Elements” to create a new field. Elements represents columns in our records

 

For column TYPE we set minimum and maximum size to 1 and type to “simple string”, we also set the description which will be used as postfix in the XML tag name, more details later.

 

Here the screenshots for adding all HEADER fields

 

 

Now we add the field references to the segment

 

 

 

 

Now we add to the segment reference to the message

 

 

We then continue to add segments and fields to complete the layout

 

 

 

If you have sample of the fixed-width file you need to generate you can test it. Click on the scenario button then browse for your sample file

 

 

Notice that the data is rendered in red which means that the converter is not recognizing the format

 

 

Open the converter URL properties and then set the following properties

dialect = GENERIC
long = elements
message = BIKES
ms = HEADER
version = FLATFILE

 

 

Now the data is rendered in green and if you click on the preview button you can see how the converter turns fixed-with into XML.

 

Now let’s create a schema which will be the target of our mapping, right click on message BIKES

 

 

 

At this point we save our definition as bikes.sef.

Now we create a new XQuery, switch to the “mapper” tab, set as source our XML input file and as target schema bike_BIKES.xsd which we have generated in the previous step

 

We start mapping the field from left to right, in addition we need to set field Type to “H” for HEADER, Type to “B” for BIKE and Type to “F” for FOOTER

 

 

 

 

 

 

Here the full mapping completed

 


 

But our mapping still generates XML in output and we need fixed-width so now we open the scenario dialog and we set the output URL

 

Now we browse for the Output URL, imput the filename we want to output, click on “Save using XML Converter”

 

On the Select Converter dialog we select EDI on the left inside. On the property table we scroll down until we get to the “user” property and we browse for the bikes.sef  file

 

 

 

 

 

 

 

Now every time we run Preview Result, Stylus Studio generates a fixed-width file output.txt

 

If we want to use our mapping in a pipeline the topography will look like this

 

PURCHASE STYLUS STUDIO ONLINE TODAY!!

Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced!

Buy Stylus Studio Now

Try Stylus Powerful XQuery IDE

Download a free trial of our award-winning IDE for XQuery today!

Learn Stylus Studio in 6 Minutes

Can six minutes change your life? A new action-packed online video demonstration covers how to use Stylus Studio in just six minutes! Watch this video and change the way you develop XML applications forever!

Why Pay More for XML Tools?

With Stylus Studio® X16 XML Enterprise Suite, you get the most comprehensive XML tool suite at one incredibly low price. Value: it's just one of many reasons why smart XML developers are choosing Stylus Studio!

Top Ten XQuery Trends

Read about the top 10 XQuery Trends and how they will impact change the way enterprise software applications are built.

 
Free Stylus Studio XML Training:
W3C Member