Re: Constrain the Number of Occurrences of Elements in your XML

Um, have you noticed the consequences of setting maxOccurs="30000" in
today's validators? I've seen out-of-memory errors with maxOccurs="1000".

There is a way to avoid the quadratic blowup (probably more than one). I
talked about one in:

http://jroller.com/comments/bobfoster/FullSpeedAhead/derivatives_of_bounded_repitition

and I believe C. M. Sperberg-McQueen is giving a presentation at the next
Extreme that covers the topic, but right now, that's really not good advice.

Bob Foster
http://xmlbuddy.com/

Roger L. Costello wrote:
> Hi Folks,
>
> Below I have jotted down a few thoughts regarding XML Schemas which
> permit an unbounded number of occurrences. Namely, I recommend against
> using maxOccurs="unbounded" in an XML Schema. I am interested in
> hearing your thoughts on this. /Roger
>
>
> Constrain the Number of Occurrences of Elements in your XML Schema
>
> *by Roger L. Costello*
> August 5, 2005
>
>
> Constrain your Data!
>
> In this message I will argue that you should never create XML Schemas
> that permit an unbounded number of occurrences.
>
> There are two ways in XML Schemas to permit an unbounded number of
> occurrences. The first way is to explicitly state that you are
> permitting an unbounded number of occurrences. For example, this
> declaration says that Bookstore can contain an unbounded number of Book
> elements:
>
> <element name="Bookstore">
>     <complexType>
>         <sequence>
>             <element name="Book" type="..." *maxOccurs="unbounded"*/>
>         </sequence>
>     </complexType>
> </element>
>
> The second way of permitting an unbounded number of occurrences is less
> obvious. Unboundedness occurs implicitly when you create a recursive
> structure. In this example there is no limit to the depth of the Section
> elements. That is, a Section can contain a Section which contains a
> Section which contains a Section ...
>
> <element name="Section" type="SectionType"/>
>
> <complexType name="SectionType">
>     <sequence>
>         <element name="Title" type="..."/>
>         <element name="Section" type="SectionType"/>
>     </sequence>
> </complexType>
>
> Both of the above forms permit an unbounded number of occurrences. I
> recommend that you never use either form. That is, never declare an
> element with maxOccurs="unbounded", and never declare a recursive
> structure. Below I explain why.
>
>
> Writing a Journal Article? Your Word Count is Limited!
>
> The situation with specifying the number of occurrences of an element in
> an XML Schema is analogous to the situation with specifying the number
> of words authors can use in an article.
>
> Suppose that you want to write an article for a journal. How many words
> can you use in your article? All journals have an upper limit on the
> number of words that you can use. Why don't the journals set the word
> limit to unbounded? Answer: there are editors that have to check the
> articles for correctness, readability, etc. The editors have limited
> resources (i.e., time). Thus, it is necessary to limit the length of the
> article. Perhaps at a later date the journal will increase the word
> limit (perhaps they hire some full-time editors). But they always have a
> definite upper limit. They never allow articles of unbounded length. The
> reason is because of limited resources.
>
>
> Error! Infinite Loop!
>
> The situation with specifying the number of occurrences of an element in
> an XML Schema is analogous to an infinite loop in programming languages.
> Why are infinite loops deemed "bad" in programming languages, yet
> unbounded occurrences are embraced in data?
>
> Let's see why infinite loops are bad in programming languages. Suppose
> that a program has a loop, and a computer begins to process the loop. It
> requires a certain amount of resources (memory, cpu cycles) for the
> computer to perform one iteration.
> Two iterations will require a bit more resources. Three iterations
> require still more. ... Infinite iterations require infinite resources.
> Thus, infinite loops are bad because they require infinite resources.
>
> The situation is analogous with data. Consider the Bookstore declaration
> above. It declares that an unbounded number of Book elements are
> permitted within Bookstore. A program that must process XML instances
> conforming to the declaration must have the necessary resources (memory,
> cpu cycles). To process one Book element will require a certain amount
> of resources. To process a second Book element will require a bit more
> resources. A third book will require still more resources. ... Infinite
> Books require infinite resources. Even though XML instance documents are
> always finite, the schema indicates that there is a "potential" for an
> infinite number of Book elements. A program that is designed to process
> "any" XML instance document that conforms to the schema must therefore
> have an infinite amount of resources.
>
>
> Okay, then what Value should I use for maxOccurs?
>
> "Suppose that I anticipate that Bookstore will never have more than
> 30,000 Books, so I set maxOccurs='30000'. After some time the
> requirements change and BookStore now needs to be able to hold 35,000
> Books. Won't I have to change the Schema every time my needs change?
> Wouldn't it be easier if I simply declared maxOccurs='unbounded'?"
>
> Answer: yes, you will need to change the Schema whenever your
> requirements change. Yes, it is easier to simply declare
> maxOccurs='unbounded'. But don't do it! The number that you use for
> maxOccurs should be as big as your programs are willing and able to cope
> with, and no more. If at some point the number of actual books exceeds
> that number then they must either (1) extend your program's resources to
> handle the expanded number, or (2) refuse to allow more books.
>
>
> Recap
>
>    1. Don't use maxOccurs="unbounded"
>    2. Don't use recursive constructions
>    3. Set maxOccurs to a number no larger than the amount of resources
>       you have available
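
As an illustration only (neither poster proposed this), the "refuse to allow more books" option from the quoted message could also be enforced in the consuming program rather than through a large maxOccurs value, which sidesteps the validator memory problem described in the reply. The sketch below is a minimal Python example under stated assumptions: the limits MAX_BOOKS and MAX_SECTION_DEPTH, the function check_limits, and the exception LimitExceeded are hypothetical names invented here, and the element names Book and Section are borrowed from the quoted schema fragments without namespaces.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical limits for illustration; pick values your program can handle.
MAX_BOOKS = 30000
MAX_SECTION_DEPTH = 8


class LimitExceeded(Exception):
    """Raised when a document exceeds a limit the program is prepared to handle."""


def check_limits(source, max_books=MAX_BOOKS, max_depth=MAX_SECTION_DEPTH):
    """Stream through an XML document, counting Book elements and tracking
    how deeply Section elements nest. Returns (book_count, deepest_nesting).

    Note: this counts every Book element anywhere in the document, not only
    direct children of Bookstore.
    """
    books = 0
    depth = 0
    deepest = 0
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if elem.tag == "Book":
                books += 1
                if books > max_books:
                    raise LimitExceeded(f"more than {max_books} Book elements")
            elif elem.tag == "Section":
                depth += 1
                deepest = max(deepest, depth)
                if depth > max_depth:
                    raise LimitExceeded(f"Section nesting deeper than {max_depth}")
        else:  # "end" event
            if elem.tag == "Section":
                depth -= 1
            # Drop the element's text and children to reduce memory use;
            # the emptied element itself stays attached to its parent.
            elem.clear()
    return books, deepest


if __name__ == "__main__":
    doc = b"<Bookstore><Book/><Book/><Book/></Bookstore>"
    print(check_limits(io.BytesIO(doc)))
```

The check is a simple counter and a depth variable, so its cost does not depend on how large the limits are; raising a limit later means changing one constant in the program rather than the schema.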