Re: ASN.1 is an XML Schema Language (Fix those lists!)and Bina
Robin Berjon wrote:

>> I think that some of the sophisticated encodings like PER are very
>> hard to get right and complete, too (I have never looked into these
>> encodings, so have no first-hand experience here).
>
> I don't think PER qualifies as "very hard", but yes it certainly
> qualifies as much harder than BER.

Figuring out the code that generates PER encodings of things is tricky
due to all the rules about getting rid of redundant things, but once
you've done it all, the result is quite simple code, you'll be glad to
know. PER encodings aren't all that complex - they're just finely tuned :-)

Code that reads PER might look something like:

    Person PERReadPerson (InputStream in) {
        Person p = new Person ();
        p.name = PERReadArbitraryLengthString (in);

        p.age = PERRead8BitUnsignedInteger (in,0,200);
        // age was constrained to the range 0..200, so
        // an 8 bit unsigned integer is used.
        // Note that if it was given the range 100..300
        // then we would read in an 8 bit unsigned int
        // from 0 to 200, and add 100 to it.

        p.phone = PERReadShortString (in,10);
        // The phone number field is a string of up to 10
        // chars, so uses a 'short string' encoding with a length
        // byte or something like that

        p.geneticFingerprint =
            PERReadFixedLengthLimitedAlphabetString (in, 16, "ATGC");
        // the genetic fingerprint is precisely 16 chars, with the
        // character set just being A, T, G, and C. The string is
        // encoded with two bits per character, so four chars per
        // byte, so four bytes.

        // Later versions of the spec may add more fields than we know
        // about, so we have a trailing extension flag
        boolean hasExtensionData = PERReadBoolean (in);
        if (hasExtensionData) {
            // skip it
            int length = PERReadInteger (in);
            in.skip (length);
        }

        return p;
    }

Writing a general BER parser is, I reckon, probably a shade simpler than
an equivalent XML parser, since you don't have entities and declarations
and the differences between attributes and elements to worry about.
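To give a feel for how little a general BER reader has to do, here is a minimal sketch, not a complete implementation. It assumes single-byte tags and definite lengths only, so it does not handle chunked or indefinite-length strings. The names BerWalkSketch, Node, readLength and readNode are made up for this illustration.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: walks definite-length BER with single-byte tags only.
final class BerWalkSketch {

    /** One decoded value: its tag byte, raw content, and children if constructed. */
    static final class Node {
        final int tag;
        final byte[] content;
        final List<Node> children = new ArrayList<>();

        Node(int tag, byte[] content) {
            this.tag = tag;
            this.content = content;
        }
    }

    /** Reads a definite length in short form (one byte) or long form (1..3 extra bytes). */
    static int readLength(ByteBuffer in) {
        int first = in.get() & 0xFF;
        if (first < 0x80) {
            return first;                       // short form: the byte is the length
        }
        int count = first & 0x7F;               // long form: number of length bytes
        if (count == 0 || count > 3) {
            throw new IllegalArgumentException("length form not handled in this sketch");
        }
        int length = 0;
        for (int i = 0; i < count; i++) {
            length = (length << 8) | (in.get() & 0xFF);
        }
        return length;
    }

    /** Reads one tag-length-value triple, recursing into constructed values. */
    static Node readNode(ByteBuffer in) {
        int tag = in.get() & 0xFF;
        if ((tag & 0x1F) == 0x1F) {
            throw new IllegalArgumentException("multi-byte tags not handled in this sketch");
        }
        int length = readLength(in);
        if (length > in.remaining()) {
            throw new IllegalArgumentException("truncated input");
        }
        byte[] content = new byte[length];
        in.get(content);
        Node node = new Node(tag, content);
        if ((tag & 0x20) != 0) {                // bit 6 set: constructed, content is more TLVs
            ByteBuffer inner = ByteBuffer.wrap(content);
            while (inner.hasRemaining()) {
                node.children.add(readNode(inner));
            }
        }
        return node;
    }

    public static void main(String[] args) {
        // SEQUENCE { UTF8String "Al", INTEGER 42 }
        byte[] encoded = {0x30, 0x07, 0x0C, 0x02, 'A', 'l', 0x02, 0x01, 0x2A};
        Node root = readNode(ByteBuffer.wrap(encoded));
        System.out.println("children: " + root.children.size());
    }
}
```

The whole tree falls out of one recursive function because every value, whatever its type, uses the same tag, length, content layout.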
All the children of your node are laid out in order in the same format,
and every string is prefixed with its length (albeit possibly broken into
chunks, each with a length prefix and a "more follows" flag, if the string
was generated by streaming).

A decoder for our Person record in BER might look more like:

    Person BERReadPerson (InputStream in) {
        Person p = new Person ();
        p.name = BERReadString (in);

        p.age = BERReadInteger (in);
        if (p.age < 0 || p.age > 200)
            throw new ValueConstraintException (...);

        p.phone = BERReadString (in);
        if (p.phone.length() > 10)
            throw new ValueConstraintException (...);

        p.geneticFingerprint = BERReadString (in);
        if (!p.geneticFingerprint.matches ("[GATC]{16}"))
            throw new ValueConstraintException (...);

        // Later versions of the spec may add extra fields that we don't
        // know about. Skip past them until the end of sequence marker
        // is found.
        BERSkipToEndSequenceMarker (in);

        return p;
    }

Note that BERReadString etc. do not take it on trust that a string is due
next from the stream, as the PER ones do; they read the type code and if
there's actually an integer or a sequence there, they throw an exception.

As I gather it, PER was devised at a time when ASN.1 was coming under fire
because the most widely used encoding, BER, was "wasteful and bloated"
since it contained the type and length codes on everything. PER is ASN.1's
equivalent of "binary XML" ;-)

ABS
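To put rough numbers on the "wasteful and bloated" complaint: in BER the 16-character fingerprint costs a tag byte, a length byte and 16 content bytes, while the PER-style packing described in the decoder comments needs four bytes. The sketch below illustrates that packing and the "subtract the lower bound" trick for constrained integers. It is an illustration under assumptions, not a conforming PER codec: the class and method names are invented, and the alphabet is indexed in the order given, whereas real PER defines its own canonical ordering.

```java
// Hypothetical sketch of two PER-style space savings; not a conforming PER codec.
final class PerPackingSketch {

    /** Stores a value constrained to [lo, hi] as its offset from lo. */
    static int encodeConstrained(int value, int lo, int hi) {
        if (value < lo || value > hi) {
            throw new IllegalArgumentException("value outside constraint");
        }
        return value - lo;
    }

    /** Recovers the original value by adding the lower bound back. */
    static int decodeConstrained(int encoded, int lo) {
        return lo + encoded;
    }

    /** Packs a string over a 4-character alphabet at two bits per character. */
    static byte[] packTwoBit(String s, String alphabet) {
        if (alphabet.length() != 4) {
            throw new IllegalArgumentException("alphabet must have exactly 4 characters");
        }
        byte[] out = new byte[(s.length() + 3) / 4];
        for (int i = 0; i < s.length(); i++) {
            int index = alphabet.indexOf(s.charAt(i));
            if (index < 0) {
                throw new IllegalArgumentException("character not in alphabet");
            }
            // First character goes in the top two bits of each byte.
            out[i / 4] |= index << (6 - 2 * (i % 4));
        }
        return out;
    }

    /** Reverses packTwoBit; the length is known from the schema, not the data. */
    static String unpackTwoBit(byte[] packed, int length, String alphabet) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            int index = (packed[i / 4] >> (6 - 2 * (i % 4))) & 0x3;
            sb.append(alphabet.charAt(index));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] packed = packTwoBit("GATTACAGATTACAGA", "ATGC");
        System.out.println(packed.length + " bytes: "
                + unpackTwoBit(packed, 16, "ATGC"));
    }
}
```

Note that the decoder needs the length, the bounds and the alphabet from the schema; none of them appear in the encoded bytes, which is exactly where the savings over BER come from.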