
Re: An efficient, safe, extensible XML data design ...mimickin

  • From: Thomas Passin <list1@tompassin.net>
  • To: xml-dev@lists.xml.org
  • Date: Thu, 26 Mar 2015 18:47:57 -0400

On 3/26/2015 4:12 PM, Costello, Roger L. wrote:
Consider this scenario: you have installed a device that monitors the
data that flows through a router. With that device you record
information about the flow, e.g.,

<Flow-data>
   <Number-of-bytes>500</Number-of-bytes>
   <Source-IPv4-address>129.87.74.0</Source-IPv4-address>
   <Destination-IPv4-address>129.87.75.0</Destination-IPv4-address>
</Flow-data>
... [more complex examples snipped]

I think this is an awful misuse of XML. You're creating an extremely
verbose format that will be prone to parsing errors if anything goes
wrong. The data element values and codes have to be kept in their
correct sequence, so that if any code value or data value accidentally
gets omitted (could happen), all subsequent values will be corrupted.
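
To make the sequencing problem concrete, here is a minimal Python sketch. The examples that used integer codes were snipped above, so the flat code/value layout and the codes 1, 2, 3 below are invented purely for illustration; only the sample values come from the quoted record.

```python
# Hypothetical flat layout: codes and values alternate and are paired purely
# by position. The codes "1", "2", "3" are made up for this illustration.
def pair_up(items):
    """Pair items positionally: (items[0], items[1]), (items[2], items[3]), ..."""
    return list(zip(items[0::2], items[1::2]))

intact = ["1", "500", "2", "129.87.74.0", "3", "129.87.75.0"]
damaged = intact[:1] + intact[2:]  # the value "500" was accidentally omitted

print(pair_up(intact))   # every code lines up with its own value
print(pair_up(damaged))  # from the gap onward, codes and values swap roles
```

With one value missing, the code "2" is read as the value for code "1", and every later pair is shifted the same way.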

The format is extremely verbose for the kinds of data values it conveys.

You say "It uses integer codes (very efficient) rather than string
element descriptors (very inefficient ..." but the length of the code
values is trivial compared with the number of characters for the element
start and stop tags, so that's pretty well irrelevant. The rest of the format is very inefficient, at least so far as the ratio of data to boilerplate is concerned.

"Flow data" sounds like it should be able to stream as long as one wants, but this format won't be able to do that.

Seems to me that this data would do better using JSON. Or if there is a
*lot* of data, some binary encoding of ASN.1, perhaps. And it takes a
lot to get an old XML hand like me to say something like that!
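
For comparison, here is a sketch of the same record as JSON, using Python's standard json module. The key names "bytes", "src", and "dst" are my own invention, not anything from the original proposal.

```python
import json

# Key names are hypothetical; only the three values come from the quoted record.
flow = {
    "bytes": 500,
    "src": "129.87.74.0",
    "dst": "129.87.75.0",
}

# Compact separators drop the optional spaces after "," and ":".
encoded = json.dumps(flow, separators=(",", ":"))
print(encoded)
print(len(encoded), "characters")

# Round trip: each value is looked up by key, not by position.
decoded = json.loads(encoded)
print(decoded["src"])
```

Because each value travels with its own key, a missing field shows up as a missing key instead of shifting the values that follow it.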

Sticking with XML anyway, the way you are showing
<data><value>.....</value><value>....</value></data>, you might as well omit the
<value> element and just use <data> for each value. That would simplify
it a little. Of course, if you need to specify the units for the
values, you're back to needing nested <value> elements (or else you
could use attributes, as in <data unit='m'>).
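
A small Python sketch of the attribute variant, read with the standard library's ElementTree. The <flow> wrapper element and the sample readings are assumptions for illustration; only <data> and the unit attribute come from the suggestion above.

```python
import xml.etree.ElementTree as ET

# <data unit='m'> follows the suggestion above; the <flow> wrapper and the
# sample values 12.5 and 7 are made up for this sketch.
doc = "<flow><data unit='m'>12.5</data><data unit='m'>7</data></flow>"

root = ET.fromstring(doc)
# Each <data> element carries its own value and, as an attribute, its unit.
readings = [(float(d.text), d.get("unit")) for d in root.findall("data")]
print(readings)
```

Each reading stays self-describing without a nested <value> element.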

It would be helpful if you explained what "flow data" means in this context, what data sizes you expect, whether it needs to be parsed as a long indeterminate stream, and how many data elements there will likely be. In other words, some basic practical requirements. After all, if this is a one-off for parsing 100 elements, this or almost anything will do. If a message will contain 100,000,000 data values month after month, it's a whole different thing.

TomP

