Re: Push and Pull?
On Fri, Jan 25, 2002 at 09:25:15PM -0800, Sterin, Ilya wrote:
| pull - the program controls what happens to the data the parser returns
| push - the parser controls what happens to the data returns, since it
|        maintains the while loop.

Most programs have to maintain state. There are two primary ways to do
this: via the heap or via the stack.

The most convenient way of maintaining state is to use the program
stack. In this case, you have functions which call other functions;
local variables and return results are kept on the call stack. The
compiler (or interpreter) performs all of the memory management,
pushing the arguments and local variables for a function call onto the
program stack before the function is invoked and popping them off the
stack when the function terminates. In this way the programmer doesn't
have to worry about managing memory.

The other way of maintaining state is via the heap. In this method,
the program allocates memory and gets a pointer back. The program can
then pass the pointer around as needed to access the allocated memory.
When the program is done with the memory, it is the program's
responsibility to free it. Many languages provide garbage collection
so that the programmer doesn't have to worry about freeing allocated
memory. In most object-oriented languages, memory is encapsulated as
an object.

The problem here is that if you have two "objects" interacting with
each other, only one of them can use the stack at any given time. So,
if P means producer and C means consumer, you have two different flow
control models to choose from:

            "pull"                          "push"

           P       P                       C      C
          PPP     PPP         vs          CCC    CCC
      P  CCCCC   CCCCC  P              P PPPPP  PPPPP P
     PPP CCCCCCCCCCCCC PPP            PPPPPPPPPPPPPPPPPP
    CCCCCCCCCCCCCCCCCCCCCCC          CCCCCCCCCCCCCCCCCCCC
      A    B       C    D              A   B      C   D
     Call stack over time            Call stack over time

  A. The program starts the parser.
  B. The first node arrives.
  C. The second node arrives.
  D. The parser shuts down.

Time as seen from the pull model:

  A. The producer's initialization function (aka Parse) is pushed
     onto the stack and does what it has to do. It allocates memory
     on the heap (to save its state) and then returns this chunk of
     allocated memory (usually as a Document object).

  B. The consumer proceeds along and eventually is ready for the
     first node, so it asks the producer for the next node. The
     producer is pushed onto the stack and uses the pointer (to
     memory on the heap) to figure out what node to return. Control
     is then given back to the consumer, which can process the node.

  C. The same pattern as B: control moves from the consumer (which
     keeps its state on the stack) to the producer (which reads its
     state from the heap).

  D. The consumer figures out that it doesn't need any more nodes
     and shuts down the parser. In this case, the parser is pushed
     onto the stack once more, given its heap memory. The parser
     then cleans up any opened files and frees the memory. Control
     returns to the consumer.

What's important is that the consumer can at all times use the stack
to maintain state, whereas the producer must use the heap (via an
object).

Time as seen from the push model:

  A. The consumer loads the parser and provides the parser with a
     call-back function. The consumer can also provide a data
     structure, usually memory allocated from the heap. The producer
     then initializes.

  B. When the producer is ready, it sends a node by pushing the
     consumer's call-back function onto the call stack. When the
     consumer's call-back is finished, it is popped from the stack.
     So, if the consumer wishes to track state between event
     notifications, it must use the heap-allocated pointer provided
     earlier.

  C. This is exactly the same as B; between B and C, the producer is
     in control of the process.

  D. When the producer wants to stop, it simply closes any resources
     it may have, and then returns.
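
A minimal Python sketch of the two walkthroughs above may help. It is only an illustration: PullParser, push_parse, and the two consumer functions are hypothetical names, not the API of any real parser, and the "nodes" are plain strings. The point to look for is where each side keeps its state.

```python
from typing import Callable, Iterable, List, Optional


class PullParser:
    """Hypothetical pull-style producer: its position is saved in an object."""

    def __init__(self, nodes: Iterable[str]) -> None:
        # Step A: initialize and save state outside the call stack.
        self._nodes: List[str] = list(nodes)
        self._pos = 0

    def next_node(self) -> Optional[str]:
        # Steps B and C: use the saved position to decide what to return.
        if self._pos >= len(self._nodes):
            return None
        node = self._nodes[self._pos]
        self._pos += 1
        return node

    def close(self) -> None:
        # Step D: release whatever the producer was holding.
        self._nodes = []
        self._pos = 0


def pull_consumer(nodes: Iterable[str]) -> int:
    """The consumer owns the loop; its state is an ordinary local variable."""
    parser = PullParser(nodes)
    total_length = 0
    while (node := parser.next_node()) is not None:
        total_length += len(node)
    parser.close()
    return total_length


def push_parse(nodes: Iterable[str],
               callback: Callable[[str, dict], None],
               user_data: dict) -> None:
    """Hypothetical push-style producer: it owns the loop and calls back."""
    for node in nodes:
        callback(node, user_data)


def push_consumer(nodes: Iterable[str]) -> int:
    """The consumer's state must outlive each call-back, so it lives in an object."""
    state = {"total_length": 0}  # handed to the producer up front (step A)

    def on_node(node: str, user_data: dict) -> None:
        user_data["total_length"] += len(node)

    push_parse(nodes, on_node, state)
    return state["total_length"]
```

Both consumers compute the same total. In pull_consumer the running total is a local variable, while in push_consumer it has to be carried in the state object that the producer passes back on every call.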
In this way, any state maintained by the producer is automatically
recovered as it is popped from the call stack. Note that the consumer
will most likely then free the dynamic memory used by the
heap-allocated pointer provided earlier.

Quite clearly, using the stack to maintain state is much, much simpler
than using the heap. Thus, the "pull" model is better for the
consumer.

I used "consumer" and "producer" everywhere. This is because the
application is the "consumer" when getting input, and the "producer"
when making output.

So, one may ask, what is better, "push" or "pull"? The answer raises
another question: "which one, the producer or the consumer, should be
easier to write?" If the answer is the consumer, then you want a
"pull" interface. If the answer is the producer, then you want a
"push" interface. Thus, consider the information flow below.

    FILE   ->   EVENTS   ->   EVENTS   ->   FILE
        (PARSER)   (APPLICATION)   (EMITTER)

In this case, you want to make the application's life the easiest.
Thus, the PARSER should have a "pull" interface, and the EMITTER
should have a "push" interface. (I use emitter as the opposite process
of parser.)

Now... on to your question...

| Is it safe to say then that the underlying DOM parser is
| rather a pull model, since it really maintains the loop.

Yes. Under the covers, the DOM parser is most likely using the
standard input/output library. This library is "pull" for input (read)
and "push" for output (write). So, yes, the DOM parser is using a
"pull" model from the standard input/output library.

| Or is it not, because it's build based on the pull model,
| the the DOM (processor) is a program that has control of
| the loop and actually retains it in memory?

This sounds paradoxical because you've switched contexts halfway
through your question. ;) The DOM implementation *uses* a pull
interface on standard input. It also *provides* a pull interface to
its consumers.
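
To put the pieces together, here is a rough Python sketch of the FILE -> EVENTS -> EVENTS -> FILE flow. The names parse, Emitter, and application are hypothetical, and the "event" format (one event per non-empty line) is an assumption made only for the example. A generator is used as a convenient way to get a pull interface in Python.

```python
import io
from typing import Iterator, TextIO


def parse(stream: TextIO) -> Iterator[str]:
    """Hypothetical PARSER with a pull interface: one event per non-empty line."""
    for line in stream:  # the parser itself pulls from the input library
        text = line.strip()
        if text:
            yield text


class Emitter:
    """Hypothetical EMITTER with a push interface: the application calls emit()."""

    def __init__(self, stream: TextIO) -> None:
        self._stream = stream

    def emit(self, event: str) -> None:
        # The emitter in turn pushes to the output library (write).
        self._stream.write(event + "\n")


def application(source: TextIO, sink: TextIO) -> int:
    """APPLICATION: pulls from the parser, pushes to the emitter."""
    emitter = Emitter(sink)
    count = 0  # application state stays in a local variable
    for event in parse(source):
        count += 1
        emitter.emit(f"{count}: {event}")
    return count


out = io.StringIO()
n = application(io.StringIO("alpha\n\nbeta\n"), out)
```

The application owns the only loop it has to think about, and it is the consumer on its input side and the producer on its output side, which is the arrangement argued for above.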
How the DOM maintains its information (either reading it all into
memory up front, or doing it incrementally) is an implementation
detail and does not invalidate the push vs. pull interface
distinction. What is important to label is the boundary between the
producer and the consumer.

Kind Regards,

Clark

--
Clark C. Evans                   Axista, Inc.
http://www.axista.com            800.926.5525
XCOLLA Collaborative Project Management Software