Lark 0.97 Available
Lark 0.97 is now available at http://www.textuality.com/Lark/

Lark now

+ is smaller! More code, but the class files are back down to 45k.
+ is faster! About 200K/second on my mouldy old P100, i.e. Lark parses Jon's Old Testament file (3.88M) in under 20 seconds - this is just the event-stream & syntax check. If you want to build complete trees in memory, parsing for any document slows down a lot, obviously.
+ is free of case-folding.
+ checks for duplicate attribute names attached to one element.
+ reads multiple attlist declarations, collisions go to the first.
+ won't let you &refer-to; an external text entity in an attribute value - what the spec says, and James says this is a good idea and he's usually right.
+ reads the external DTD subset if the toggle lark.processExternalEntities(true) has been set (and, of course, if a usable SYSTEM ID has been provided).
+ has a new version of the central readXML method, that allows you to specify a base URL for the document entity; necessary for relative-URL constructions such as <!DOCTYPE foo SYSTEM "foo.dtd" > to work.
+ has another Entity member java.net.URL mBaseURL; constructor argument, and set/get function pair, to retrieve the URL associated with an external entity.
+ does full PE processing, including external PE's.
+ as a result, class Entity has a new member boolean mPE; with a new argument on its constructor and a new method public boolean isPE().
- doesn't do conditional sections, still.
+ upon encountering a reference to an undeclared entity, checks to see if the declaration might have been external and bypassed; this can happen when (a) you have turned off mProcessExternalEntities, and (b) there is an external DTD subset, or (c) there is a ref to an external PE in the internal subset at a point where a whole markup declaration might be recognized. If so, Lark turns off draconian error handling and allows processing to continue; however, Handler has a new method, doWarning(), that gets called in this situation.
+ processes entity/char references correctly in <!ATTLIST default values.
+ has had the Handler.doAttlist() method changed - now takes an Object[] instead of String[] argument, since the default value is now a Text as opposed to a String, because of entities in defaults.
+ does entity declaration processing properly, doing Henry Thomson's hideous example from the spec Appendix C, and another, just as nasty, that I have cooked up for the next release of the spec. Blecch.
+ has a big bug-fix: it turns out pre-0.97 Lark almost never parsed <!doctype declarations properly, botching SYSTEM & PUBLIC identifiers; so the Handler.doDoctype() method has been rebuilt, since I can't imagine anybody ever actually did anything useful with it.
+ has a change to Handler.doSyntaxError() (sorry), which now has a third arg, char c, that gives the character that caused Lark to decide the doc wasn't well-formed... in lots of cases, this turns out to be real useful. Others not.

Cheers, Tim Bray
tbray@t... http://www.textuality.com/ +1-604-708-9592

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
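
For readers wondering why the base URL matters for <!DOCTYPE foo SYSTEM "foo.dtd" >: a relative SYSTEM ID only means something once it is resolved against the location of the entity that contains it. The sketch below is not Lark code and does not use Lark's API; the class name BaseUrlSketch, the method resolveSystemId, and the example.com addresses are all made up for illustration, and it relies only on the standard java.net.URI class.

```java
import java.net.URI;

// Hypothetical illustration only; not part of Lark.
final class BaseUrlSketch {

    // Resolve a SYSTEM identifier against the URL of the entity that
    // contains the reference. Absolute identifiers come back unchanged.
    static URI resolveSystemId(String baseUrl, String systemId) {
        return URI.create(baseUrl).resolve(systemId);
    }

    public static void main(String[] args) {
        // Assumed location of the document entity.
        String base = "http://example.com/docs/doc.xml";

        // A relative SYSTEM ID is looked up next to the document.
        System.out.println(resolveSystemId(base, "foo.dtd"));

        // An absolute SYSTEM ID ignores the base entirely.
        System.out.println(resolveSystemId(base, "http://other.example/foo.dtd"));
    }
}
```

Without a base URL there is nothing to resolve "foo.dtd" against, which is presumably why the new readXML variant takes one.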