RE: Compression
Forgive me if I'm being dense, but what happens if, as in my example,
you have two elements in the DTD/schema *both* of which can optionally
be absent (e.g. (x|y)*)? Then how can you tell in the XML instance which
of the two you are looking at, without some way of distinguishing them?
I don't see how you would be able to distinguish n choices with 1 bit.

- sn

> -----Original Message-----
> From: Jon Cleaver [mailto:j.cleaver@e...]
> Sent: Thursday, February 15, 2001 12:25 PM
> To: SNedunuri@p...
> Cc: xml-dev@l...
> Subject: Re: Compression
>
> We basically have a mechanism that takes an instance document and a
> schema and does the following:
>
> 1. Looks in the schema to see if the element can be omitted
>    (minOccurs attribute = 0)
> 1a. If it can, it reads 1 bit to see if it is there or not.
> 1b. If it cannot be omitted, it does not bother to read the bit, it
>     'knows' that the element must be there, otherwise the XML would
>     not have conformed to the schema which it purports to belong to.
>
> 2. Looks in the schema again to see if the element is repeatable
>    (maxOccurs attribute > 1)
> 2a. If it is not, then it just expects there to be one element to
>     read out.
> 2b. If it is, it reads a bit after it has decompressed each element
>     to see if there is another one on its way.
>
> In our case there was no requirement to know the number of
> repetitions beforehand, therefore a simple bit scheme like this
> suffices quite well.
>
> n.b. There are other methods we used to reduce the overhead from one
> bit per repetition to theoretically fractional numbers of bits but
> that is going a little off-topic.
>
> Cheers,
>
> Jonathan
>
> SNedunuri@p... wrote:
>
> > Hmm, interesting.
> > I want to be sure I understand your scheme, so for example, given
> > a DTD spec like
> >   <!ELEMENT SomeTag (Foo|Bar|Lah)*>
> > an XML file could be
> >   <SomeTag> <Foo.../> <Bar.../> </SomeTag>
> > or
> >   <SomeTag> <Bar.../> <Lah.../> </SomeTag>
> > I don't see how you could do this with just one bit, if you're
> > omitting tags.
> >
> > Perhaps a bit vector for the right hand side, which indicates the
> > presence or absence of that element. If it's present (1) then it's
> > followed by its content. So the first one would be
> >   1:<Foo's content>00, 01:<Bar's content>0
> > and the second
> >   01:<Bar's content>0, 001:<Lah's content>
> >
> > But isn't the repetition still a problem? How do you encode the
> > number of times it's repeated with just a single bit?
> >
> > - sn
> >
> > > -----Original Message-----
> > > From: Caroline Clewlow [mailto:cclewlow@e...]
> > > Sent: Thursday, February 15, 2001 10:32 AM
> > > To: Jeff Rafter
> > > Cc: Danny Ayers; xml-dev@l...
> > > Subject: Re: Compression
> > >
> > > As regards compression - one option that *we* have looked at is
> > > that of the schema being known at both the receiving and
> > > transmitting end of the communication. In that case a method can
> > > be used whereby the transmission of the tags themselves is not
> > > required. A single bit can be used to indicate the presence or
> > > otherwise of an optional item; repeated items can also be
> > > indicated by 0 or 1 depending on whether they are repeated or
> > > not. Then it's just a case of encoding the actual content between
> > > the tags.
> > >
> > > The XML document can be rebuilt at the receiving end by stepping
> > > through the schema checking the data for the content that is
> > > present.
> > >
> > > Hope the above description makes some sense!
> > >
> > > Regards
> > >
> > > Caroline
> > >
> > > Jeff Rafter wrote:
> > >
> > > > > A harder (but quite interesting) alternative would be to have
> > > > > a pointer on http://zip to a document specifying the
> > > > > (de)compression algorithm, from which B could build its own
> > > > > native converter.
> > > >
> > > > I don't know if it would be recreating the wheel-- or rather if
> > > > this compression system would be re-using the wheel-- but it
> > > > seems that encryption (public key/private key) fits very well
> > > > in this model. This is plain in scenario three:
> > > >
> > > > Scenario 3 :
> > > >
> > > > A tells B (in straight XML) that it has some data for it, and
> > > > it wants to encrypt it in the format found at URI
> > > > http://encrypt with the decryption key found at
> > > > http://a/decrypt
> > > > B replies - please wait
> > > > B goes to the URL pointed to at http://encrypt, downloads and
> > > > installs the decryption algorithm (if not already present) and
> > > > then downloads the public key from http://a/decrypt
> > > > B tells A, I'm ready
> > > > A sends encrypted binary...
> > > >
> > > > Jeff Rafter
> > > >
> > > > ------------------------------------------------------------------
> > > > To unsubscribe from this elist send a message with the single word
> > > > "unsubscribe" in the body to: xml-dev-request@l...
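
The presence-bit / continuation-bit scheme described in the thread can be sketched roughly as follows. This is only an illustration under stated assumptions, not the posters' actual implementation: the `Particle` class, the token-list "stream" (0/1 integers for bits, strings standing in for encoded content) and the `choice_bits` helper are all hypothetical names introduced here. The helper reflects one plausible answer to the (x|y)* question, namely that a fixed-length selector for n alternatives needs ceil(log2 n) bits rather than one; the thread itself does not confirm how that case was handled.

```python
from dataclasses import dataclass


@dataclass
class Particle:
    """Hypothetical stand-in for one element declaration in a schema."""
    name: str
    min_occurs: int = 1  # 0 means the element may be omitted
    max_occurs: int = 1  # > 1 means the element may repeat


def encode(schema, doc):
    """Turn {name: [content, ...]} into a tag-free token stream."""
    out = []
    for p in schema:
        items = doc.get(p.name, [])
        if not items and p.min_occurs > 0:
            raise ValueError(f"required element {p.name!r} is missing")
        if len(items) > p.max_occurs:
            raise ValueError(f"too many {p.name!r} elements")
        if p.min_occurs == 0:
            out.append(1 if items else 0)  # presence bit (step 1a)
            if not items:
                continue
        for i, content in enumerate(items):
            out.append(content)  # content encoding itself is out of scope
            if p.max_occurs > 1:
                # continuation bit after each element (step 2b)
                out.append(1 if i + 1 < len(items) else 0)
    return out


def decode(schema, stream):
    """Rebuild the document by walking the schema, as the thread describes."""
    it = iter(stream)
    doc = {}
    for p in schema:
        if p.min_occurs == 0 and next(it) == 0:
            continue  # optional element is absent
        items = [next(it)]
        if p.max_occurs > 1:
            while next(it) == 1:  # another occurrence follows
                items.append(next(it))
        doc[p.name] = items
    return doc


def choice_bits(n):
    """Bits for a fixed-length selector among n alternatives: ceil(log2 n)."""
    return (n - 1).bit_length()


# Example schema and document (both invented for this sketch).
schema = [
    Particle("title"),
    Particle("note", min_occurs=0),
    Particle("item", max_occurs=99),
    Particle("footer", min_occurs=0),
]
doc = {"title": ["T"], "item": ["a", "b", "c"]}
stream = encode(schema, doc)
```

Because both sides walk the same schema in the same order, no element names appear in `stream`; the required, non-repeating `title` costs no bits at all, while each optional or repeatable particle costs one bit per decision.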