XML-DEV Mailing List Archive
Re: Word and XML (was: XML standards coherency and so forth)
Sean Mc Grath writes:

> RTF doesn't map well to XML -- even very low level -- formatting
> oriented XML -- because of the way RTF is structured.
>
> It is stack based and allows structures to overlap:-
>
> \b1 bold \i1 bold italic \b0 italic \i0 plain
>
> Matching up the on/offs:-
> <b> bold <i> bold italic </b> italic </i> plain
>
> invalid XML (or indeed SGML) because of the overlaps.

This is actually quite simple to handle algorithmically by maintaining
a stack and doing a pushback when tags aren't nested:

  RTF     Tags            Stack
  -------------------------------
  \b1     <b>             (b)
  \i1     <i>             (b i)
  \b0     </i></b><i>     (i)
  \i0     </i>            ()

You'd need only four or five lines of code to handle it -- just walk
back on the stack to the nearest matching state (closing all open
tags), then reopen everything except what you just closed.

I'm not saying that you'll always get valid HTML, but at least the
tags will be properly nested.

All the best,

David

--
David Megginson david@m... http://www.megginson.com/
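The stack walk described above can be sketched roughly as follows. This is only an illustration, not code from the original post: the function name `nest_tags` and the `("open" | "close" | "text", value)` event tuples are hypothetical, and it assumes the RTF has already been tokenized into such events. Ignoring a stray close and closing leftover tags at the end of input are also assumptions.

```python
def nest_tags(events):
    """Turn possibly overlapping on/off events into properly nested tags.

    `events` is an iterable of (kind, value) pairs, where kind is
    "open", "close", or "text" (a hypothetical format for this sketch).
    """
    out = []
    stack = []  # currently open tags, outermost first
    for kind, value in events:
        if kind == "text":
            out.append(value)
        elif kind == "open":
            stack.append(value)
            out.append(f"<{value}>")
        elif kind == "close":
            if value not in stack:
                continue  # stray close with no matching open: ignored (assumption)
            reopened = []
            # Walk back on the stack, closing every tag down to the match.
            while True:
                top = stack.pop()
                out.append(f"</{top}>")
                if top == value:
                    break
                reopened.append(top)
            # Reopen everything except the tag that was just closed,
            # restoring the original nesting order.
            for tag in reversed(reopened):
                stack.append(tag)
                out.append(f"<{tag}>")
    # Close anything still open at end of input (assumption).
    while stack:
        out.append(f"</{stack.pop()}>")
    return "".join(out)


# The example from the message: \b1 bold \i1 bold italic \b0 italic \i0 plain
events = [
    ("open", "b"), ("text", " bold "),
    ("open", "i"), ("text", " bold italic "),
    ("close", "b"), ("text", " italic "),
    ("close", "i"), ("text", " plain"),
]
print(nest_tags(events))
# prints: <b> bold <i> bold italic </i></b><i> italic </i> plain
```

The emitted tags follow the same sequence as the table: closing `b` first closes `i`, then `b`, then reopens `i`. As noted above, this only guarantees proper nesting, not validity against any particular DTD.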