simple answer Re: Handling/Parsing/Validating multipleXML Stat
Hello Dan,

let me add something to the responses your question has generated :)

Dan White wrote:

> <foo> <bar> woof </bar> </foo> <foo> <bar> woof </bar> </foo>
>
> with "foo" being the root tag
>
> ...
>> Yes, preprocessing is a possibility, but how would one do it?
>>
>> How do you locate where one statement ends and the next begins?

Underneath the craziness, XML is simple to parse. Some theoreticians have called it a parentheses language: start tags open, end tags close, and everything nests properly. This insight may actually be helpful for your problem. Given a way to identify start tags, end tags, CDATA sections, comments and processing instructions (check the grammar), the following algorithm will do:

1   initialize a counter c to 0
2   traverse the input until the first start tag
3   c++
4   while c > 0
4.1   traverse the input until the next '<'
      (a literal '<' is not allowed in attribute values or in character
      data, so outside the constructs skipped in 4.2 it always begins markup)
4.2   if start tag, then c++
      else if end tag, then c--
      else if empty-element tag (<bar/>), leave c unchanged
      else if CDATA section, skip to the next "]]>"
      else if comment, skip to the next "-->"
      else if processing instruction, skip to the next "?>"
5   cut here, just after the end tag that brought c back to 0

Two details to watch: '>' is allowed inside attribute values, so look for the end of a tag outside quotes, and comments may contain both '<' and '>', which is why they have to be skipped up to "-->" and not just to the next '>'.

It gets simpler and faster if you can assume that there are no processing instructions, comments, CDATA sections, etc.

This only works for well-formed XML fragments.

No doubt, if you had your own parser, you could tell it to just read the first element. There are also some libraries that support this directly, but I don't know of any in C or C++.

Hope this helps.

cheers,
Burak
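
The counting steps above can be sketched in a few lines. This is only an illustration, written in Python for brevity; the same loop ports directly to C or C++. The function names (split_top_level_elements, _skip_past, _tag_end) are made up for this sketch, and it assumes well-formed input that fits in one string and has no DOCTYPE with an internal subset. It also differs slightly from the numbered steps: it keeps going after the first cut, so it returns every top-level element, not just the first one.

```python
def _skip_past(text, marker, i):
    """Return the index just after the next occurrence of marker at or after i."""
    j = text.find(marker, i)
    if j == -1:
        raise ValueError("missing %r" % marker)
    return j + len(marker)


def _tag_end(text, i):
    """Return the index of the '>' that closes a tag, ignoring '>' inside quotes."""
    quote = None
    while i < len(text):
        ch = text[i]
        if quote:
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == ">":
            return i
        i += 1
    raise ValueError("unterminated tag")


def split_top_level_elements(text):
    """Split concatenated well-formed XML elements into one string per element."""
    pieces = []
    depth = 0      # the counter c from the numbered steps
    start = None   # where the current top-level element begins
    i = 0
    while True:
        i = text.find("<", i)
        if i == -1:
            break
        if text.startswith("<!--", i):            # comment
            i = _skip_past(text, "-->", i + 4)
        elif text.startswith("<![CDATA[", i):     # CDATA section
            i = _skip_past(text, "]]>", i + 9)
        elif text.startswith("<?", i):            # processing instruction
            i = _skip_past(text, "?>", i + 2)
        elif text.startswith("<!", i):            # other declaration (assumed simple)
            i = _skip_past(text, ">", i + 2)
        elif text.startswith("</", i):            # end tag: c--
            i = _skip_past(text, ">", i + 2)
            depth -= 1
            if depth < 0:
                raise ValueError("end tag without matching start tag")
            if depth == 0:                        # "cut here"
                pieces.append(text[start:i])
                start = None
        else:                                     # start tag or empty-element tag
            end = _tag_end(text, i + 1)
            if depth == 0:
                start = i
            if text[end - 1] == "/":              # <bar/>: counter unchanged
                if depth == 0:
                    pieces.append(text[start:end + 1])
                    start = None
            else:
                depth += 1                        # start tag: c++
            i = end + 1
    if depth != 0:
        raise ValueError("input ended inside an element")
    return pieces
```

Calling split_top_level_elements on the two-<foo> input quoted above returns a list of two strings, one per <foo> element, each of which can then be handed to an ordinary parser or validator. Comments and processing instructions that sit between top-level elements are dropped; those inside an element stay with it.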