# Wednesday, 13 June 2012

Life Cycle of a Purchase Order: EDI X12, XML and PDF

Most of the business transactions B to C (Business to Consumer) today are done by credit card over the web. On the B to B (Business to Business) the volume and the velocity require a different approach, where computer systems do most of the work. These systems are called supply chain networks and their job is to connect buyers and sellers electronically to exchange goods of any kind.

Supply chain networks have existed for many years but were difficult and costly to integrate, which restricted their market to the elite. Their inability to accept medium and small businesses forced a large portion of the supply chain networks out of business.

The survivors learned the lesson and started to offer a scaled down version of their services, increasing the size of their customer base. The larger the network, the greater the chance of success, especially in growing markets such as China and India.

Computer hardware evolves at a fast pace, as Moore’s law states: every two years CPU makers double the number of transistors in a silicon die, improving the lithography process. While IT organizations tend to upgrade their servers every 3 years, data architects are slow to move from time tested successful standards.  These standards provide the back bone of systems’ interoperability.

One of greatest examples of successful standards for data interchange format is Electronic Data Interchange (EDI), created by the National Institute of Standards and Technology (NIST) in 1996.  EDI’s goals were simple: a text based format with a well-defined structure, which allows two entities to exchange information.

The beginning of the formal definition reads: <<”the computer-to-computer interchange of strictly formatted messages that represent documents other than monetary instruments.”>>.

EDI was an instant hit. A variety of standard bodies specialized in industry verticals creating their own dialects. Here are a few examples: 

·         Accredited Standards Committee X12

·         United Nations/Electronic Data Interchange For Administration, Commerce and Transport (UN/EDIFACT)

·         Health Insurance Portability and Accountability Act of 1996 (HIPAA).

EDI is the legacy format that supply chain systems still use today.  Industry specific EDI dialects perform millions of transactions per day.  EDI standards boards have also come out with an XML version of the dialects but the legacy format is still the de-facto standard.

Now in order to participate in these  markets, you have to understand how to interpret EDI transactions, and how to respond to a partner.  Walmart uses EDI to allow it to scale its business, requiring all suppliers to interact with them using EDI to send documents such as Purchase Orders and Invoices. http://www.walmartstores.com/Suppliers/248.aspx

Before we start, let’s refresh our memory on  the definition of a Purchase Order. Wikipedia states:
<<”A purchase order (PO) is a commercial document issued by a buyer to a seller, indicating types, quantities, and agreed prices for products or services the seller will provide to the buyer.”>>

In the following image we see a Purchase Order formatted according to EDI X12 850 in its raw form.

Company “Office Supply” wants to purchase digital pens and appropriate paper blocks from company “Smart Pen” and deliver the products to company “OfficeY”.

Unfortunately, this is not like XML.  You may guess that fields are separated by a special character but there is no markup around them, therefore is very difficult to interpret the values. In the following image we see the same EDI transaction in the Stylus Studio EDI to XML Module.  This is a very useful tool for analyzing EDI documents. We can click anywhere and Stylus Studio shows us which field we are on, thanks to its vast EDI repository. When we right click on a value that represents a code, we can see the description.

Now that the purchase order is in XML we can take advantage of several technologies to manipulate and to store the data.

The EDI to XML module can be used to validate both the structure, the code list values and in case of errors, provide suggestions to work around the problem. For example, it’s quite common to encounter a transaction in which some mandatory fields are missing or the field type does not match the specification. In these situations Stylus Studio flags the error and allows the user to overwrite the field definition to accommodate such customizations.

In Stylus Studio, almost any format can be represented as XML. In the screenshot below the XML Editor is used to edit our purchase order.

The XML representation can be customized with a variety of parameters. You may decide to generate a very verbose form with long element names. You may also use informative comments that provide a detailed description for each field. When looking at code values in the raw form, they are usually very cryptic terms made of a few characters, which can now be associated with detailed descriptions.  The example below shows you that the code value BP stands for “Paid by Buyer”.

 

When an EDI transaction comes in, the first thing we want to do is to archive it in a database. Often there are regulations that require electronic copies to be maintained for several years and storing the transaction in a relation database is the most common place where it can be archived. The following screenshot shows how the Stylus Studio XQuery mapping tool can help you to onboard the transaction into a database. In our example we are using Microsoft SQL Server 2008.

We would like to point out a benefit of using the DataDirect XQuery engine, bundled with Stylus Studio. This combination allows you to store primitive types such as the order id and the order date as well as the entire input document into an XML type column.

 

After the XQuery execution we can see a new record has been created in the table.

EDI and XML share similar benefits. They are great for making computer systems communicate, but human beings need some help to make this information easy to interpret. The following screenshot shows how Stylus Studio XML Publisher can be used to produce an appealing PDF document that can be used to display on the screen and on printed paper.

 

We hope you enjoyed reading this article. If you have any questions, do not hesitate to contact us.

You can download the Project Zip file by clicking here.

- Stylus Studio Team

 Technical Support
 Follow us on Twitter
 Connect on Facebook  

posted on Wednesday, 13 June 2012 14:58:54 (Eastern Daylight Time, UTC-04:00)  #    Comments [0] Trackback
# Tuesday, 27 March 2012

Introduction to XSLT 3.0

While many W3C specifications take years to reach the recommendation state, XSLT has evolved quickly and deterministically, thanks not in small part to the great talent and sobriety of its spec. chair and a dedicated board committee.

The Stylus Studio team decided to be on the cutting edge, introducing support for the current XSLT 3.0 working draft in version X14 in order to give a chance to the community to start developing using the new language edition.

A variety of exciting new features have been introduced to make the language modern and to allow implementers to take advantage of modern hardware for transforming large data sets.

Support for Streaming

The need to process XML in streaming fashion, in other words, without loading the entire input document in memory, has risen over the years.  Several use cases require processing very large streams of XML events, for example stocking tickers or social media user's stream.

Here I show the specification formally defines streaming:

<<" A processor that claims conformance with the streaming option offers a guarantee that  ... an algorithm will be adopted ... allowing documents to be processed that are orders-of-magnitude larger than the physical memory available.">>

In 2007, a team of XML experts came up with a dedicated language called STX, Streaming Transformation for XML, to tackle the problem.  Even if the language did not gain significant popularity, it was a valuable exercise to identify use cases and come up with a declarative approach. Such experience has been an important inspiration for introducing the streaming feature in XSLT 3.0.

XSLT 3.0 introduces new constructs (xsl:stream, xsl:mode streamable="yes") to explicitly indicate to stream the execution of its instruction body.  Under streaming mode, there are a number of restrictions to be aware of:

·         You have access only to the current element attributes and namespace declaration.

·         Sibling nodes and ancestor sibling are not reachable.

·         You can visit child nodes only once.


 

The following diagram illustrates which nodes are accessible while processing an xml document that contains a list of books.


Here is an example of how to split a very large document into small fragments:

<?xml version="1.0"?>
<xsl:stylesheet version="3.0"
                xmlns:xsl=
"http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="/">
        <xsl:stream href="books.xml">
            <xsl:iterate select="/books/book">
                <xsl:result-document href="{concat('book', position(),'.xml')}">
                    <xsl:copy-of select="."/>
                </xsl:result-document>
                <xsl:next-iteration/>
            </xsl:iterate>
        </xsl:stream>
    </xsl:template>

</xsl:stylesheet>

 

Also of interest is the new instruction xsl:fork which declares that an XSLT block can be executed independently, during a single pass of a streamed input document.

Unfortunately, Saxon does not implement declarative streaming at the time of this writing.


 

Higher-Order Functions

Higher order functions are functions that either take functions as parameters or return a function.

XPath 3.0 introduces the ability to define anonymous functions and the XDM has been extended with the function item type. Such changes open the door to meta-programming using lambda expressions.

Let us start with an example: here is a lambda expression that calculates the square of two numbers and sums them.
(x, y) x*x + y*

Such expressions can be can be reworked into an equivalent function that accepts a single input, and as output returns another function, that in turn accepts a single input .
x (y x*x + y*y)

The variable f1 is assigned to an anonymous function that takes an integer and returns a function that takes an integer and returns an integer.

<?xml version='1.0'?>
<xsl:stylesheet
    
version="3.0"
    xmlns:xsl=
"http://www.w3.org/1999/XSL/Transform"
    xmlns:xs=
"http://www.w3.org/2001/XMLSchema">
<xsl:template match="/">
        <xsl:variable name="f1" select="
            function($x as xs:integer) as (function(xs:integer) as xs:integer){

                    function ($y as xs:integer) as xs:integer{
                        $x*$x + $y * $y
                    }

            }
        "
/>
        <xsl:value-of select="$f1(2)(3)"/>
</xsl:template>
</xsl:stylesheet>

 

XPath 3.0 provides built-in support for common lambda patterns  such as map, filter, fold-left, fold-right, map-pairs. Here is an example of folding that sums only positive numbers from a list:

<?xml version="1.0"?>
<xsl:stylesheet  version="3.0"xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:variable name="list" select="(10,-20,30,-40)"/>

    <xsl:template match="/">
        <xsl:variable name="f1" select="
        function($accumulator as item()*, $nextItem as item()) as item()*
        {
            if($nextItem &gt; 0) then
                $accumulator + $nextItem
            else
                $accumulator
        }"
/>

        <xsl:value-of select="fold-left($f1, 0, $list)"/>
    </xsl:template>
</xsl:stylesheet>

Text Manipulations

The language designers had always considered text manipulation an important feature, starting from XSLT 1. Functions for formatting numbers, date and time played an important role in building html content and eventually were moved to XPath in order to be shared with XQuery. XPath 2.0 introduced a large number of functions for manipulating strings: tokenize, matches, replace, string-join, upper-case, and lower-case. 

 Version 3 introduces a variety of new built-in functions for manipulating text, which are very useful when dealing with CSV data such as unparsed-text-lines, unparsed-text-available.

The following example shows how to implement a simple CSV to XML converter:

<?xml version="1.0"?>
<xsl:stylesheet version="3.0"
     xmlns:xsl=
"http://www.w3.org/1999/XSL/Transform"
        xmlns:xs=
"http://www.w3.org/2001/XMLSchema"
        xmlns:hd=
"urn:header">

    <xsl:param name="csv" select="'one.csv'"/>
    <xsl:param name="sep" select="','"/>
    <xsl:param name="rootElement" select="'root'"/>
    <xsl:param name="rowElement" select="'row'"/>
    <xsl:param name="firstRow" select="true()"/>

    <xsl:variable name="header" select="tokenize(unparsed-text-lines($csv)[1], $sep)"/>

    <xsl:function name="hd:header" as="xs:string">
        <xsl:param name="col"/>
        <xsl:choose>
            <xsl:when test="$firstRow">
                <xsl:value-of select="$header[$col]"/>
            </xsl:when>
            <xsl:otherwise>item</xsl:otherwise>
    </xsl:choose>
    </xsl:function>

    <xsl:template match="/">
        <xsl:element name="{$rootElement}">
            <xsl:for-each select="unparsed-text-lines($csv)[position() &gt; 1]">
                <xsl:element name="{$rowElement}">
                    <xsl:for-each select="tokenize(., $sep)">
                        <xsl:variable name="pos" select="position()"/>
                        <xsl:element name="{hd:header($pos)}">
                            <xsl:value-of select="."/>
                        </xsl:element>
                    </xsl:for-each>
                </xsl:element>
            </xsl:for-each>
        </xsl:element>
    </xsl:template>
</xsl:stylesheet>

 

When processing in input a file like the following:

make,model,year,mileage
BMW,R1150RS,2004,14274
Kawasaki,GPz1100,1996,60234
Ducati,ST2,1997,24000
Moto Guzzi,LeMans,2001,12393
BMW,R1150R,2002,17439
Ducati,Monster,2000,15682
Aprilia,Futura,2001,17320

 

Produces as output

<?xml version='1.0' ?>
<root>
  <row>
    <make>BMW</make>
    <model>R1150RS</model>
    <year>2004</year>
    <mileage>14274</mileage>
  </row>

...
</root>

 

Conclusions

As you can see, there are many changes to look forward to in the upcoming XSLT 3.0 version. The specification is still under discussion and has not been finalized.  The Stylus Studio Team will follow this closely and will release intermediate builds to provide a reference implementation in order to be prepared when version 3.0 goes live.
 
 
 
 

posted on Tuesday, 27 March 2012 13:37:06 (Eastern Standard Time, UTC-05:00)  #    Comments [0] Trackback