
Re: Protocol Buffers - Why not use XML

  • From: Hans-Juergen Rennau <hrennau@yahoo.de>
  • To: Arjun Ray <arjun.ray@verizon.net>, "xml-dev@l..." <xml-dev@l...>
  • Date: Sat, 13 Feb 2016 23:58:35 +0000 (UTC)

A few things should be noted.

The .proto format is not the format in which data are exchanged or persisted; it is the source format from which the binary exchange and storage format is generated. This choice of source syntax is arbitrary and has nothing whatsoever to do with the actual advantages of the binary format: one could just as easily define an XML format equivalent to .proto and compile that, rather than .proto, into the binary format. Hence the comparison between XML and .proto (in its compiled form) is inadequate, because it compares the output of one transformation with the input of an alternative one. It is like comparing a cupboard with elm trees and emphasizing the practical advantages of cupboards over elm trees.

We are informed that there are currently 48,162 message types defined across 12,183 .proto files. How does one manage and monitor a data model repository of that size? If the .proto files were in fact XML files, all kinds of analysis and consistency control would be very easy. For example, assuming the 12,183 .proto files were XML files located in a directory tree rooted at /proto, a glossary of all .proto-defined item names could be obtained with the following XQuery expression:
   sort(distinct-values(file:list('/proto', true(), '*.proto') ! concat('/proto/', .) ! doc(.)//*/local-name(.)))
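
For readers without an XQuery processor at hand, roughly the same glossary can be sketched with the Python standard library. This is only an illustration under the same assumption as above (the files under the directory are well-formed XML); the function names `glossary` and `local_name` are made up for this example.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def local_name(tag):
    # ElementTree spells namespaced names as "{uri}local"; keep the local part.
    return tag.rsplit("}", 1)[-1]


def glossary(root_dir, pattern="*.proto"):
    """Sorted, de-duplicated local names of all elements in matching XML files."""
    names = set()
    # rglob walks the directory tree recursively, like file:list(..., true(), ...).
    for path in Path(root_dir).rglob(pattern):
        for elem in ET.parse(path).iter():
            names.add(local_name(elem.tag))
    return sorted(names)
```

Calling `glossary('/proto')` would then play the role of the expression above; the XQuery version remains the terser of the two.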

Any tools allowing the same analysis in as terse a form using the actual .proto format?

So XML-structured data are open to a power of expressing, evaluating and transforming information which is indeed remarkable. The data are merged, at no extra cost, into a single space of interconnected information. Recognizing this peculiar effect presupposes a readiness to view XML not only as a data format, but as several things: an information model, an information processing model (consisting of the definition of expression kinds), technologies implementing the processing model and, last and least, a syntax.

But in fact I find that .proto *is* XML. Or perhaps I should say that tree-structured information is tree-structured information, the same letter in two envelopes. Tree-structured information is what XML and its stack of technologies are about. All it takes to make the equation .proto=XML true (in an operational sense) is to define an equivalent XML format (which is easy) and write a parser that parses .proto into XML. The XML power of expression and operation then becomes immediately applicable to .proto. For example, the glossary mentioned above is obtained, again, by an XQuery one-liner (assuming the availability of an XQuery function proto:parse, which parses .proto data into XML):
   sort(distinct-values(file:list('/proto', true(), '*.proto') ! concat('/proto/', .) ! proto:parse(.)//*/local-name(.)))

(Note I just replaced the function call doc(.) by the call proto:parse(.).)
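
To make the idea of such a parser concrete, here is a minimal Python sketch of what a proto:parse stand-in might look like. Everything about it is an assumption made for illustration: the XML vocabulary (one element per defined name, with a `kind` attribute) is hypothetical, the function names are invented, and only a small subset of .proto syntax is handled (blocks, fields, enum values, simple top-level statements). Options, map fields, block comments and other details are not treated properly.

```python
import re
import xml.etree.ElementTree as ET

# Strings, identifiers (possibly dotted), integers, and the four punctuation
# marks this sketch cares about. Brackets and parentheses are simply ignored.
TOKEN = re.compile(r'"[^"]*"|[A-Za-z_][\w.]*|\d+|[{}=;]')
LABELS = {"optional", "required", "repeated"}


def parse_proto(text):
    """Parse a small subset of .proto syntax into a hypothetical XML format."""
    text = re.sub(r"//[^\n]*", "", text)  # drop line comments (naive)
    tokens = TOKEN.findall(text)
    root = ET.Element("proto")
    _parse_body(tokens, 0, root)
    return root


def _parse_body(tokens, pos, parent):
    """Consume statements until the enclosing block ends; return the position."""
    while pos < len(tokens) and tokens[pos] != "}":
        if pos + 2 < len(tokens) and tokens[pos + 2] == "{":
            # Block such as `message Person { ... }` or `enum Kind { ... }`:
            # the defined name becomes the element name.
            child = ET.SubElement(parent, tokens[pos + 1], kind=tokens[pos])
            pos = _parse_body(tokens, pos + 3, child) + 1  # step over "}"
        else:
            end = tokens.index(";", pos)
            _add_statement(parent, tokens[pos:end])
            pos = end + 1
    return pos


def _add_statement(parent, stmt):
    if not stmt:
        return
    if parent.get("kind") is None:
        # Top-level `syntax = "proto3";` or `package a.b;` become root
        # attributes; a repeated keyword overwrites the earlier value.
        parent.set(stmt[0], stmt[-1].strip('"'))
    elif "=" in stmt:
        eq = stmt.index("=")
        left, number = stmt[:eq], stmt[eq + 1]
        if parent.get("kind") == "enum":
            ET.SubElement(parent, left[0], kind="value", number=number)
        elif len(left) >= 2:
            attrs = {"kind": "field", "type": left[-2], "number": number}
            if left[0] in LABELS:
                attrs["label"] = left[0]
            ET.SubElement(parent, left[-1], attrs)


def glossary(proto_texts):
    """Sorted, de-duplicated names of everything defined in the given sources."""
    names = set()
    for text in proto_texts:
        root = parse_proto(text)
        names.update(elem.tag for elem in root.iter() if elem is not root)
    return sorted(names)
```

Because the defined names are used as element names, the glossary is again just the set of element names, which mirrors the `//*/local-name(.)` step of the one-liner. A different XML vocabulary (for instance generic `message` and `field` elements with `name` attributes) would work equally well and would only change the path used to collect the names.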

The most fruitful perspective I can imagine in this context is to shift the focus from syntax to information content, to recognize that equivalent information content can be expressed in alternative syntax formats, and to lose the best part of one's former interest in the relative merits of those syntax formats, because letters are more important than envelopes. You can start with the syntax you prefer for the particular use case, and just enter the XML model when it is advantageous, namely when you would like to apply the power of XML technologies to structured information, be it XML, .proto, JSON, CSV, or what not.

Hans-Jürgen Rennau


Arjun Ray <arjun.ray@verizon.net> wrote at 23:21 on Saturday, 13 February 2016:


On Sat, 13 Feb 2016 21:58:03 +0000, Peter Flynn <peter@s...>
wrote:

| We have just not been very good about making that clear; in my field,
| largely because programmers glaze over when you talk about XML and
| documents.

Mainly because they had already been sold on the idea that XML was a
serialization format.  The problem was to unsell them, and that would
have taken uncommon persuasion skills.

| On the other hand, I have documents exactly like this, with a markup
| payload of 500% of the document text and more, precisely because it adds
| value to the documents for their users far exceeding any minor
| inconveniences of pointy brackets getting in the way.

Yes, but compare that to the overhead in the example presented here:

http://blog.codinghorror.com/xml-the-angle-bracket-tax/

where the "value added", such as it could be, is risibly minimal. (For
a very good reason: the base information payload isn't a "document"
except by definitional legerdemain only.)


_______________________________________________________________________

XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.

[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@l...
subscribe: xml-dev-subscribe@l...
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php



