Re: Genx
Tim asked me to send this e-mail to the list as well:

----- Original Message -----
From: "Tim Bray" <tbray@t...>
To: <xml-dev@l...>
Sent: Tuesday, January 20, 2004 1:49 AM
Subject: Genx

> At one point during the discussions about Atom and well-formedness, I
> offered to cook up some libraries for safely and efficiently writing
> guaranteed-well-formed XML; it seems that the world is well-provided
> with these for Java, but several people wrote me saying such a thing
> would be really handy at the C level; libxml2 being OK but too big and
> complicated for most people's purposes in this particular regard and
> also suspected of a memory leak.
>
> So I sketched out a design, see the write-up at
> http://www.tbray.org/ongoing/When/200x/2004/01/19/HeresGenx and have at
> it. -Tim

Since I have dealt with a similar issue (on another platform), I checked
your API. My comments are not necessarily meant as criticism of the API;
I am just highlighting things I did differently or do not understand, in
no particular order. (I left out anything I had already seen discussed.)

- About genxSetAllocator: I would think a Free and a Realloc pointer
  should be provided too.
- Does the API require that all input is checked for UTF-8 (or UTF-32)
  encoding correctness (and Name correctness) even if that is already
  known to the application?
  - Providing extra functions to do that checking might be enough?
- About genxPI: Why not have two arguments, let's call them target and
  data? The first one, target, is a name; the other one isn't, so
  different checking rules apply.
- Why not escape illegal input by default? If you want to write a
  comment or attribute, the assumption should be that you want to write
  it correctly, so why not do the necessary escaping for the programmer?
  Different rules apply for comments, attributes and character data (and
  entity values, but there is no API for them).
- Allow a choice of quote character for attributes, and allow three
  options for ampersand escaping: none, numeric character reference, or
  the amp entity.
- genxEndStartTag is not really necessary, as this can be derived from
  internal state tracking.
- Add formatting features:
  - Style: none or indented. This can be chosen for each nesting level
    dynamically. E.g. simple bottom-level elements with character data
    might have no indentation, but their parent elements might have it.
  - Set the indentation amount.
  - Set the (starting) indentation level, e.g. for writing heavily
    indented fragments.
  - A genxNewLineIndented function, to add a new line and whitespace
    according to the current level.
  - For this purpose one might need some genxStartContent(formattingStyle)
    function, to be called whenever the attributes (if any) are finished,
    so that formatting is determined at this level and downwards until
    overridden at a lower level.
- genxEndTag: add a shortForm (boolean) argument, so that the element, if
  empty, can be written as <element/>.
- Add genxStart/EndCDATA functions, and allow un-escaped output in
  between.
- About genxCharacter: does this write a character reference?
- About genxCheckName: it should differentiate between namespaces being
  turned on or off, as a colon is legal in one case but not in the other.
- Should one support DTD output (declarations)? For instance, for writing
  an internal subset.
- What about a genxXML function that writes the XML declaration?
- How does one declare a default namespace? One cannot pass an empty
  prefix, as this means one will be created.
- Are namespace declarations tracked internally? What about an API to:
  - find the most recently declared prefix currently active for a given
    URI
  - find all prefixes active for a given URI

Regards,
Karl
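
To illustrate the remark above that different escaping rules apply to character data, attributes and comments, here is a minimal C sketch. It is not part of Genx: `EscapeContext`, `escapeXml` and `commentTextIsLegal` are hypothetical names chosen for this example, and the input is assumed to be valid UTF-8 containing only legal XML characters. Character data and attribute values can be repaired by escaping, while comment text can only be checked, because XML offers no escape mechanism inside comments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical context selector; not part of the Genx API. */
typedef enum { CTX_TEXT, CTX_ATTR_DQUOTE } EscapeContext;

/* Append src at *pos if it fits (including the terminating NUL).
 * Returns 0 on success, -1 if the buffer is too small. */
static int append(char *dst, size_t cap, size_t *pos, const char *src)
{
    size_t n = strlen(src);
    if (*pos + n + 1 > cap) return -1;
    memcpy(dst + *pos, src, n);
    *pos += n;
    dst[*pos] = '\0';
    return 0;
}

/* Escape `in` for use as character data or as a double-quoted
 * attribute value. Returns 0 on success, -1 on buffer overflow. */
int escapeXml(const char *in, EscapeContext ctx, char *out, size_t cap)
{
    size_t pos = 0;
    assert(in != NULL && out != NULL);
    if (cap == 0) return -1;
    out[0] = '\0';
    for (; *in != '\0'; in++) {
        char one[2] = { *in, '\0' };   /* default: copy the byte as-is */
        const char *rep = one;
        int attr = (ctx == CTX_ATTR_DQUOTE);
        switch (*in) {
        case '&':  rep = "&amp;"; break;
        case '<':  rep = "&lt;";  break;
        case '>':  rep = "&gt;";  break;  /* avoids a literal "]]>" in text */
        case '"':  if (attr) rep = "&quot;"; break;
        /* Whitespace in attributes is normalized by parsers, so it is
         * written as character references to survive a round trip. */
        case '\t': if (attr) rep = "&#x9;"; break;
        case '\n': if (attr) rep = "&#xA;"; break;
        /* A literal CR is subject to line-end normalization everywhere. */
        case '\r': rep = "&#xD;"; break;
        default:   break;
        }
        if (append(out, cap, &pos, rep) != 0) return -1;
    }
    return 0;
}

/* Comment text cannot be escaped, only rejected: it must not contain
 * "--" and must not end with '-'. Returns 1 if legal, 0 otherwise. */
int commentTextIsLegal(const char *s)
{
    size_t n = strlen(s);
    if (strstr(s, "--") != NULL) return 0;
    if (n > 0 && s[n - 1] == '-') return 0;
    return 1;
}
```

With this sketch, the same input escapes differently per context: a double quote is left alone in character data but becomes `&quot;` in an attribute value, which is the kind of distinction a writer library could make on the programmer's behalf.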