An efficient, safe, extensible XML data design ... mimicking in XML a bin

  • From: "Costello, Roger L." <costello@mitre.org>
  • To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
  • Date: Thu, 26 Mar 2015 20:12:47 +0000

Hi Folks,

Consider this scenario: you have installed a device that monitors the data that flows through a router. With that device you record information about the flow, e.g.,

<Flow-data>
    <Number-of-bytes>500</Number-of-bytes>
    <Source-IPv4-address>129.87.74.0</Source-IPv4-address>
    <Destination-IPv4-address>129.87.75.0</Destination-IPv4-address>
</Flow-data>

Suppose there are many like-minded people who collect flow data: they all want to monitor and record the number of bytes, the source IP address, the destination IP address, and so forth. So someone (or a group of people) creates a registry of standard flow information elements:

Information Element    Meaning
-------------------    ------------------------
1                      Number of bytes
2                      Source IPv4 address
3                      Destination IPv4 address
. . .

Given that registry, we can create a different XML design, one where consumers are expected to make use of the registry to process/understand the XML:

<Flow-data>
    <Information-element>
        <Code>1</Code>
        <Value>500</Value>
    </Information-element>
    <Information-element>
        <Code>2</Code>
        <Value>129.87.74.0</Value>
    </Information-element>
    <Information-element>
        <Code>3</Code>
        <Value>129.87.75.0</Value>
    </Information-element>
</Flow-data>
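To make "consumers are expected to make use of the registry" concrete, here is a rough sketch in Python of what such a consumer might do. The REGISTRY dictionary and the decode_flow function are names I made up for illustration; a real consumer would presumably load the registry from wherever it is published.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory copy of the registry table shown earlier in this message.
REGISTRY = {
    1: "Number of bytes",
    2: "Source IPv4 address",
    3: "Destination IPv4 address",
}

FLOW_XML = """
<Flow-data>
    <Information-element>
        <Code>1</Code>
        <Value>500</Value>
    </Information-element>
    <Information-element>
        <Code>2</Code>
        <Value>129.87.74.0</Value>
    </Information-element>
    <Information-element>
        <Code>3</Code>
        <Value>129.87.75.0</Value>
    </Information-element>
</Flow-data>
"""


def decode_flow(xml_text):
    """Return a list of (meaning, value) pairs, one per Information-element."""
    root = ET.fromstring(xml_text.strip())
    decoded = []
    for element in root.findall("Information-element"):
        code = int(element.findtext("Code"))
        value = element.findtext("Value")
        # Codes that are not in the registry are kept, but labelled as unknown.
        meaning = REGISTRY.get(code, f"unknown code {code}")
        decoded.append((meaning, value))
    return decoded


if __name__ == "__main__":
    for meaning, value in decode_flow(FLOW_XML):
        print(f"{meaning}: {value}")
```

Without the registry the document is just numbers; with it, each code resolves back to the meaning that the first design spelled out in the element names.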

The potential drawback of that design is that it intermingles the data (500, 129.87.74.0, 129.87.75.0) with the codes (1, 2, 3). So let's create an XML design which separates the semantic description of the data (i.e., the codes) from the data, e.g.,

<Flow-data>
    <Data-codes>
        <Code>1</Code>
        <Code>2</Code>
        <Code>3</Code>
    </Data-codes>
    <Data>
        <Value>500</Value>
        <Value>129.87.74.0</Value>
        <Value>129.87.75.0</Value>
    </Data>
</Flow-data>
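Once the codes and the values live in separate lists, the only thing tying them together is position: the first Code describes the first Value, and so on. Here is a small Python sketch of that pairing. The function name pair_codes_with_values is mine, and the length check is an assumption about what a careful consumer would want, not something the design itself mandates.

```python
import xml.etree.ElementTree as ET

FLOW_XML = """
<Flow-data>
    <Data-codes>
        <Code>1</Code>
        <Code>2</Code>
        <Code>3</Code>
    </Data-codes>
    <Data>
        <Value>500</Value>
        <Value>129.87.74.0</Value>
        <Value>129.87.75.0</Value>
    </Data>
</Flow-data>
"""


def pair_codes_with_values(xml_text):
    """Pair the n-th Code with the n-th Value, in document order."""
    root = ET.fromstring(xml_text.strip())
    codes = [int(c.text) for c in root.findall("Data-codes/Code")]
    values = [v.text for v in root.findall("Data/Value")]
    # Position is the only link between the two lists, so a count
    # mismatch means the document cannot be interpreted reliably.
    if len(codes) != len(values):
        raise ValueError(f"{len(codes)} codes but {len(values)} values")
    return list(zip(codes, values))


if __name__ == "__main__":
    for code, value in pair_codes_with_values(FLOW_XML):
        print(code, value)
```

The upside of the split is that the list of codes acts like a reusable description of the data; the price is that a missing or extra Value silently shifts everything after it unless you check for it.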

This new design assumes every user will want to collect exclusively the standard flow information elements (number of bytes, source IP address, destination IP address, etc.). But some users (companies) might want to collect non-standard, proprietary flow information elements. So let's add to the XML an indication of whether a code is a standard code or a non-standard, proprietary code:

<Flow-data>
    <Data-codes>
        <Code-specifier>
            <Standard-flow-data-item>true</Standard-flow-data-item>
            <Code>1</Code>
        </Code-specifier>
        <Code-specifier>
            <Standard-flow-data-item>true</Standard-flow-data-item>
            <Code>2</Code>
        </Code-specifier>
        <Code-specifier>
            <Standard-flow-data-item>true</Standard-flow-data-item>
            <Code>3</Code>
        </Code-specifier>
    </Data-codes>
    <Data>
        <Value>500</Value>
        <Value>129.87.74.0</Value>
        <Value>129.87.75.0</Value>
    </Data>
</Flow-data>

That instance document doesn't show the use of a non-standard flow data item, so let's add one more data item: A company records the non-standard XYZ flow data item (code=15); the metering process measures 23 such items:

<Flow-data>
    <Data-codes>
        <Code-specifier>
            <Standard-flow-data-item>true</Standard-flow-data-item>
            <Code>1</Code>
        </Code-specifier>
        <Code-specifier>
            <Standard-flow-data-item>true</Standard-flow-data-item>
            <Code>2</Code>
        </Code-specifier>
        <Code-specifier>
            <Standard-flow-data-item>true</Standard-flow-data-item>
            <Code>3</Code>
        </Code-specifier>
        <Code-specifier>
            <Standard-flow-data-item>false</Standard-flow-data-item>
            <Code>15</Code>
        </Code-specifier>
    </Data-codes>
    <Data>
        <Value>500</Value>
        <Value>129.87.74.0</Value>
        <Value>129.87.75.0</Value>
        <Value>23</Value>
    </Data>
</Flow-data>

Next, consumers want to know, "Hey, what authority (company/enterprise) does that non-standard code come from?" So we need to include an indication of the authority defining the flow information element:

<Flow-data>
    <Data-codes>
        <Code-specifier>
            <Standard-flow-data-item>true</Standard-flow-data-item>
            <Code>1</Code>
        </Code-specifier>
        <Code-specifier>
            <Standard-flow-data-item>true</Standard-flow-data-item>
            <Code>2</Code>
        </Code-specifier>
        <Code-specifier>
            <Standard-flow-data-item>true</Standard-flow-data-item>
            <Code>3</Code>
        </Code-specifier>
        <Code-specifier>
            <Standard-flow-data-item>false</Standard-flow-data-item>
            <Code>15</Code>
            <Code-of-the-enterprise-that-defined-the-code>115</Code-of-the-enterprise-that-defined-the-code>
        </Code-specifier>
    </Data-codes>
    <Data>
        <Value>500</Value>
        <Value>129.87.74.0</Value>
        <Value>129.87.75.0</Value>
        <Value>23</Value>
    </Data>
</Flow-data>

TaDa!

I like that XML design. I think it is very powerful. It uses integer codes (very efficient) rather than string element descriptors (very inefficient and prone to security breaches). It supports extensibility; users are not locked into a limited, rigid set of flow information elements.
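To show how a consumer might keep standard and proprietary codes apart, here is a hedged Python sketch that reads the final design above. It keys each value by an (enterprise, code) pair, using None as the enterprise for standard codes. The function name resolve_items and that keying scheme are my own choices for illustration.

```python
import xml.etree.ElementTree as ET

ENTERPRISE_TAG = "Code-of-the-enterprise-that-defined-the-code"

FLOW_XML = """
<Flow-data>
    <Data-codes>
        <Code-specifier>
            <Standard-flow-data-item>true</Standard-flow-data-item>
            <Code>1</Code>
        </Code-specifier>
        <Code-specifier>
            <Standard-flow-data-item>true</Standard-flow-data-item>
            <Code>2</Code>
        </Code-specifier>
        <Code-specifier>
            <Standard-flow-data-item>true</Standard-flow-data-item>
            <Code>3</Code>
        </Code-specifier>
        <Code-specifier>
            <Standard-flow-data-item>false</Standard-flow-data-item>
            <Code>15</Code>
            <Code-of-the-enterprise-that-defined-the-code>115</Code-of-the-enterprise-that-defined-the-code>
        </Code-specifier>
    </Data-codes>
    <Data>
        <Value>500</Value>
        <Value>129.87.74.0</Value>
        <Value>129.87.75.0</Value>
        <Value>23</Value>
    </Data>
</Flow-data>
"""


def resolve_items(xml_text):
    """Return [((enterprise, code), value), ...]; enterprise is None for standard codes."""
    root = ET.fromstring(xml_text.strip())
    specifiers = root.findall("Data-codes/Code-specifier")
    values = [v.text for v in root.findall("Data/Value")]
    if len(specifiers) != len(values):
        raise ValueError(f"{len(specifiers)} code specifiers but {len(values)} values")
    items = []
    for spec, value in zip(specifiers, values):
        is_standard = spec.findtext("Standard-flow-data-item").strip() == "true"
        code = int(spec.findtext("Code"))
        enterprise = None
        if not is_standard:
            enterprise_text = spec.findtext(ENTERPRISE_TAG)
            # A proprietary code is ambiguous unless we know who defined it.
            if enterprise_text is None:
                raise ValueError(f"non-standard code {code} has no enterprise code")
            enterprise = int(enterprise_text)
        items.append(((enterprise, code), value))
    return items


if __name__ == "__main__":
    for key, value in resolve_items(FLOW_XML):
        print(key, value)
```

Keying on the pair rather than on the bare code means enterprise 115's code 15 can never be confused with a standard code 15, or with some other enterprise's code 15.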

Recall that I talked about a "registry of standard data flow information elements." There actually is such a registry: http://www.iana.org/assignments/ipfix/

Also, recall that I talked about (or at least hinted at) a "registry of enterprise identifiers." There actually is such a registry: http://www.iana.org/assignments/enterprise-numbers/

What I have just walked you through is the design approach used by:

IPFIX (IP Flow Information Export): an IETF standard data format
for representing flow data (http://tools.ietf.org/html/rfc7011)

IPFIX is a binary data format. I have walked you through the equivalent design in XML.
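For the curious, here is a rough Python sketch of how one Code-specifier maps onto the binary side. As I read RFC 7011, a field specifier is a 1-bit enterprise flag plus a 15-bit information element identifier, then a 16-bit field length, then a 32-bit enterprise number that is present only when the flag is set, all in network byte order. The function name pack_field_specifier is mine, and the 4-octet field lengths are placeholders I picked for the example; treat this as an illustration, not a conforming exporter.

```python
import struct


def pack_field_specifier(code, length, enterprise=None):
    """Pack one field specifier: flag + 15-bit code, 16-bit length, optional enterprise."""
    if not 0 <= code < 0x8000:
        raise ValueError("code must fit in 15 bits")
    if not 0 <= length <= 0xFFFF:
        raise ValueError("length must fit in 16 bits")
    if enterprise is None:
        # Standard item: enterprise flag is 0, no enterprise number follows.
        return struct.pack("!HH", code, length)
    # Proprietary item: set the top bit and append the 32-bit enterprise number.
    return struct.pack("!HHI", 0x8000 | code, length, enterprise)


if __name__ == "__main__":
    # Standard code 1 from the example (the length of 4 is an assumed placeholder).
    print(pack_field_specifier(1, 4).hex())            # 00010004
    # Non-standard code 15 defined by enterprise 115.
    print(pack_field_specifier(15, 4, 115).hex())      # 800f000400000073
```

So the Standard-flow-data-item element in the XML plays the role of that single flag bit (true in the XML corresponds to the flag being 0), and Code-of-the-enterprise-that-defined-the-code plays the role of the trailing enterprise number.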

Pretty cool, aye?

/Roger


