Re: Compiled XML
Dear All,

Let me explain the ISO/MPEG-7 context and the solution that MPEG developed to handle this "compression" issue. I hope this is of interest to you.

MPEG-7 THE CONTEXT

MPEG-7 is a very large XML language (700 XML Schema types) for defining audiovisual metadata. It is the result of the fruitful effort of many companies and national bodies all around the world. MPEG-7 is composed of several parts:

Part 1 - Systems
Part 2 - DDL
Part 3 - Visual descriptor
Part 4 - Audio descriptor
Part 5 - Multimedia Description Scheme
Part 6 - Conformance

MPEG-7's main goal is to describe audiovisual content at different levels of granularity, ranging from very low-level description (mean color, and so on) to high-level description (semantic relationships, actor names, copyright information, etc.). MPEG-7 adopted XML to represent this metadata and chose XML Schema as its schema language. However, because bandwidth is very expensive in the broadcast industry and because MPEG-7 descriptions can be very large, MPEG-7 definitely had to define a "compiled version of XML".

MPEG-7 BINARY FORMAT - BiM

Part 1 (Systems) of the standard defines a binary format for XML documents called BiM. BiM relies on the XML Schema definition of an XML language to automatically generate a very compact binary format for that language. Elements and attributes are encoded with a few bits, while values (leaves) are encoded using dedicated encoders (IEEE-754 for floats, UTF-8 for strings, ...). BiM supports most XML Schema features, including sub-typing (xsi:type), substitution groups, and so on. BiM is generic: it can deal with any XML language, not only MPEG-7.

As its main features, BiM generates a very compact representation of an XML document that includes information to considerably speed up searching or filtering. It is streamable, which means that document deltas can be sent to update a remote version of an XML document. This simple encoding scheme has proved to be very efficient.
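To make the idea of schema-driven encoding more concrete, here is a minimal toy sketch in Python. It is not the BiM algorithm defined in Part 1 of the standard. The `SCHEMA` table, the element names, the fixed-width tag index and the 1-byte string length prefix are all assumptions made up for this illustration. It only shows the general principle described above: element names known from a shared schema are replaced by a few bits, while leaf values go through a type-specific encoder (IEEE-754 for floats, UTF-8 for strings).

```python
import math
import struct

# Hypothetical toy schema shared by encoder and decoder: element name -> leaf type.
# (Real BiM derives its codes from the XML Schema types; this table is made up.)
SCHEMA = {
    "MeanColor": "float",
    "ActorName": "string",
    "Copyright": "string",
}
NAMES = sorted(SCHEMA)  # fixed order known to both sides
TAG_BITS = max(1, math.ceil(math.log2(len(NAMES))))  # 3 names -> 2 bits


def encode(items):
    """Encode a list of (element_name, value) pairs as a string of '0'/'1'."""
    out = []
    for name, value in items:
        # Element name: a small fixed-width index instead of the tag text.
        out.append(format(NAMES.index(name), f"0{TAG_BITS}b"))
        if SCHEMA[name] == "float":
            payload = struct.pack(">f", value)  # IEEE-754 single precision
        else:
            data = value.encode("utf-8")
            payload = bytes([len(data)]) + data  # toy limit: at most 255 bytes
        out.extend(format(b, "08b") for b in payload)
    return "".join(out)


def _read_bytes(bits, pos, n):
    """Read n bytes from the bit string starting at bit offset pos."""
    return bytes(int(bits[i:i + 8], 2) for i in range(pos, pos + 8 * n, 8))


def decode(bits):
    """Inverse of encode(); only works with the same SCHEMA on both sides."""
    items, pos = [], 0
    while pos < len(bits):
        name = NAMES[int(bits[pos:pos + TAG_BITS], 2)]
        pos += TAG_BITS
        if SCHEMA[name] == "float":
            value = struct.unpack(">f", _read_bytes(bits, pos, 4))[0]
            pos += 32
        else:
            length = int(bits[pos:pos + 8], 2)
            pos += 8
            value = _read_bytes(bits, pos, length).decode("utf-8")
            pos += 8 * length
        items.append((name, value))
    return items


doc = [("ActorName", "Ann"), ("MeanColor", 0.5)]
encoded = encode(doc)
print(len(encoded), "bits")
print(decode(encoded))
```

In this toy, each element costs only 2 bits for its name, and the decoder cannot interpret the stream without the same schema. That is the trade-off a schema-based format makes in exchange for compactness.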
In recent tests, a BiM decoder is between 10 and 30 times faster than the Xerces C SAX parser at producing SAX events. In the case of direct parsing, it can be between 20 and 100 times faster. File size can be reduced by up to 80%. BiM performs as well on small files as on large files, and it can be combined with zip to outperform zip compression by a factor of 2 to 5.

In conclusion, BiM technology is very well suited to environments where bandwidth is expensive or where large numbers of XML documents have to be parsed. It is a particularly good fit for the TV and mobile industries.

MPEG-7 (ISO 15938) will be published in a few weeks as an ISO international standard.

You can find more information on the official MPEG website: http://mpeg.telecomitalialab.com/
Some information about BiM can be found at: http://www.expway.tv/bim/bim.html

Best regards,
Claude.

_________________

Michael Rys wrote:
>
> SQL Server 2000 uses a tokenized, binary XML format if it talks to an
> OLEDB 2.6 or higher provider that then turns it into XML (in the stream
> mode). So yes, binary XML formats do work and are being widely deployed.
> They save space (although I agree that using compression on the wire is
> normally better), they avoid to/from text serialization etc. Only
> problem is that any standardized format will most likely not be useful
> for most use cases since it will not cover the specific needs (it would
> be a compromise and thus basically useless).
>
> There are several papers at WWW9 and WWW10 on general XML compression
> and ATT did some research on XMill. Also some tools basically use the
> DOM API (some persistent DOMs), SAX event streams (push) or XMLReader
> (pull) interfaces to avoid the serialized form.
>
> Best regards
> Michael
>
> > -----Original Message-----
> > From: Alaric Snell [mailto:alaric@a...]
> > Sent: Wednesday, March 27, 2002 5:02 AM
> > To: Mike Champion; xml-dev@l...
> > Subject: Re: Compiled XML
> >
> > On Wednesday 27 March 2002 12:53, you wrote:
> > > 3/27/2002 6:50:59 AM, Alaric Snell <alaric@a...> wrote:
> > > >Hi, Mike! How's the weather? :-)
> > >
> > > Uhh, lousy, especially compared to Spain last week :~)
> >
> > Shame, it's getting quite nice here in London now...
> >
> > > The response on this list to the Binary XML discussions
> > > has typically been "sounds plausible in theory, I've
> > > never seen it work well enough in practice to adopt."
> > > I don't have an axe to grind in this discussion other than
> > > wanting to answer a very frequently asked question for
> > > a newcomer. Could you share some empirical, quantitative
> > > data from real success stories using these techniques?
> > > Did you really make significant speed improvements without
> > > memory bloat, or significant reduction of memory requirements
> > > without additional processor horsepower?
> >
> > Well, there was some comment from the Coccoon2 people about a serialised SAX
> > event stream format, and the ASN.1 folks have been having fun comparing
> > BER/PER (ASN.1 encodings) with something called Megaco, which is an XML-like
> > tree structured textual description too...
> >
> > I'll implement what I described in an earlier post, since it'll be a useful
> > open source tool anyway I'm sure, and do some file size / run time tests at
> > the weekend.
> >
> > ABS
> >
> > --
> > Alaric B. Snell
> > http://www.alaric-snell.com/ http://RFC.net/ http://www.warhead.org.uk/
> > Any sufficiently advanced technology can be emulated in software
> >
> > -----------------------------------------------------------------
> > The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> > initiative of OASIS <http://www.oasis-open.org>
> >
> > The list archives are at http://lists.xml.org/archives/xml-dev/
> >
> > To subscribe or unsubscribe from this list use the subscription
> > manager: <http://lists.xml.org/ob/adm.pl>

--
______________________________________________
** NEW ADDRESS **
- - - - - - - - -
Claude Seyrat
EXPWAY
17, rue du Pont aux Choux
75003 Paris, France
T: +33 1 44 54 29 28
M: +33 6 07 66 26 63
F: +33 1 44 54 90 49
E: claude.seyrat@e...
www.expway.tv