An Architecture for Limericks (was Limericks, Stupidity, and Reality)
Computer scientists don't write limericks without designing a general architecture for limericks.

Suppose I wanted to write the limerick constraint testing system that Mike wants. I would probably want to separate concerns among several different modules, and I would like to make it easy to look at the results of each step.

I think that I am unlikely to persuade my limerick authors to write their limericks in finely grained markup, so I suspect they will give me texts like this:

    There was a young lady named Bright
    Whose speed was much faster than light.
    She set out one day,
    in a relative way,
    And returned on the previous night.

With a Perl script, it is fairly easy to mark up the lines of this poem. I would like to do this before running my syllabification, because the syllabification engine is quite likely to lose my whitespace, which is important for identifying the lines. If I already know this is a limerick, I could choose to divide it into long and short lines from the beginning:

    <limerick>
      <long>There was a young lady named Bright</long>
      <long>Whose speed was much faster than light.</long>
      <short>She set out one day,</short>
      <short>in a relative way,</short>
      <long>And returned on the previous night.</long>
    </limerick>

For testing the rhyme scheme, further markup is probably not helpful. Also, I am probably not going to keep any markup I use for testing whether a line scans, so I will use a schema adjunct to declare the constraints on this poem.
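The line markup step described above can be sketched in a few lines. This is a minimal illustration in Python rather than Perl, and it assumes the author's text arrives as exactly five non-empty lines, with lines 1, 2 and 5 long and lines 3 and 4 short. The function name `mark_up_limerick` is hypothetical.

```python
from xml.sax.saxutils import escape


def mark_up_limerick(text: str) -> str:
    """Wrap each line of a five-line limerick in <long> or <short>."""
    # Line breaks are the only structure we have, so use them before
    # any later step (such as syllabification) can discard them.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) != 5:
        raise ValueError(f"expected 5 lines, got {len(lines)}")

    # Assumption: the usual limerick layout of long, long, short, short, long.
    tags = ["long", "long", "short", "short", "long"]

    # escape() protects any &, < or > that appears in the poem text.
    body = "\n".join(
        f"  <{tag}>{escape(ln)}</{tag}>" for tag, ln in zip(tags, lines)
    )
    return f"<limerick>\n{body}\n</limerick>"


if __name__ == "__main__":
    poem = """There was a young lady named Bright
Whose speed was much faster than light.
She set out one day,
in a relative way,
And returned on the previous night."""
    print(mark_up_limerick(poem))
```

A poem that does not have five lines is rejected outright here. A real system might instead pass it on to a checker for some other verse form.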
I use standard poetry terms for the metrical feet - here is a table of the terms, compared with their representation in Cowan Normal Form (CWF):

    iamb            da-DUM
    anapest         da-da-DUM
    tertius paeon   da-da-DUM-da

Here is a Schema Adjunct that declares the constraints on a limerick:

    <schema-adjunct targetNamespace="http://www.example.com/limerick"
                    xmlns="http://www.schema-adjuncts.org/namespaces/2001/07/saf">
      <global>
        <rhymes>
          <line select="limerick/long[1]" />
          <line select="limerick/long[2]" />
          <line select="limerick/long[3]" />
        </rhymes>
        <rhymes>
          <line select="limerick/short[1]" />
          <line select="limerick/short[2]" />
        </rhymes>
      </global>
      <element context="short">
        <scans>
          <sequence>
            <choice>
              <iamb />      <!-- da dum -->
              <anapest />   <!-- da da dum -->
            </choice>
            <choice>
              <iamb />
              <anapest />
            </choice>
          </sequence>
        </scans>
      </element>
      <element context="long">
        <scans>
          <sequence>
            <choice>
              <iamb />
              <anapest />
            </choice>
            <choice>
              <iamb />
              <anapest />
            </choice>
            <choice>
              <iamb />
              <anapest />
              <tertius.paeon />   <!-- da da dum da -->
            </choice>
          </sequence>
        </scans>
      </element>
    </schema-adjunct>

So far, I have written no code, so I have no software that will tell me whether a line scans or whether a set of lines rhymes. However, I do have a way of declaring the structure of a poem in a Schema Adjunct, and I can use it to describe the structure of other kinds of poems as well. The specific algorithms for testing these constraints are up to the implementations, but I have modularized the implementation. I have also made it easier to test - I can write test suites that take sets of words that are presumed to rhyme or not to rhyme, and see whether my system handles them correctly. I can do the same for scansion.

Now suppose that more than one rhyming engine exists, and more than one scansion engine exists. Do these engines agree? If not, how do they disagree? Are there bugs in one or both of the engines, or are their answers both reasonable?
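As a rough illustration of such a test suite, the sketch below runs two rhyming engines over the same word pairs and reports where each one fails and where they disagree. Both engines are deliberately crude toys invented for this example, not real rhyme detectors, and every name in the code (`suffix_engine`, `vowel_tail_engine`, `run_suite`, `disagreements`, `CASES`) is an assumption.

```python
from typing import Callable, List, Tuple

RhymeEngine = Callable[[str, str], bool]
Case = Tuple[str, str, bool]  # (word, word, presumed to rhyme?)


def suffix_engine(a: str, b: str) -> bool:
    """Toy engine: two words rhyme if their last three letters match."""
    return a.lower()[-3:] == b.lower()[-3:]


def vowel_tail_engine(a: str, b: str) -> bool:
    """Toy engine: two words rhyme if they match from the last vowel letter on."""
    def tail(word: str) -> str:
        w = word.lower()
        for i in range(len(w) - 1, -1, -1):
            if w[i] in "aeiou":
                return w[i:]
        return w  # no vowel letter at all: compare the whole word
    return tail(a) == tail(b)


def run_suite(engine: RhymeEngine, cases: List[Case]) -> List[Case]:
    """Return the cases where the engine's answer differs from the expected one."""
    return [(a, b, exp) for a, b, exp in cases if engine(a, b) != exp]


def disagreements(first: RhymeEngine, second: RhymeEngine,
                  cases: List[Case]) -> List[Tuple[str, str]]:
    """Return the word pairs on which the two engines give different answers."""
    return [(a, b) for a, b, _ in cases if first(a, b) != second(a, b)]


CASES: List[Case] = [
    ("Bright", "light", True),
    ("Bright", "night", True),
    ("day", "way", True),
    ("light", "day", False),
    ("cough", "though", False),  # same spelling, different sound
]

if __name__ == "__main__":
    print("suffix failures:    ", run_suite(suffix_engine, CASES))
    print("vowel-tail failures:", run_suite(vowel_tail_engine, CASES))
    print("disagreements:      ", disagreements(suffix_engine, vowel_tail_engine, CASES))
```

On these cases the two toys disagree only about "day" and "way", and both wrongly accept "cough" and "though". Engines that agree can therefore still both be wrong, and a list of passes and failures says nothing about why.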
If the answers to these questions are important to me, a concrete representation of the output of the engines may be very helpful. Without it, I would have to compare the source code of the systems, or try to create exhaustive sets of tests that would give me indications of how they work.

For instance, suppose I ask the software to test whether the following scans:

    <long>There was a young lady named Bright</long>

If it says that it does not, I may not be sure whether there is a bug in my software or an error in the line of the limerick. If there is a bug in my software, I may not know if the bug is in the syllabification per se, in the stress assigned to syllables, or in the comparison of the syllabification and stress to that required of a long line in a limerick. For testing purposes, output like the following can be very helpful indeed:

    <long>
      <da>There</da>
      <dum>was</dum>
      <da>a</da>
      <da>young</da>
      <dum>la</dum>
      <da>dy</da>
      <da>named</da>
      <dum>Bright</dum>
    </long>

Not only is this useful for testing, it is also useful for defining interfaces. For instance, I might well have a system that takes the above representation and compares it to the declared scansion for the long line of a limerick, as given in the above schema adjunct. This would be very simple to write.

In general, when designing complex systems, I think it is very helpful to think in terms of declarative, testable architectures.

Jonathan
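One possible shape for the comparison step mentioned above is sketched here in Python. It reads the da/dum representation of a line and checks it against the sequence of choices declared for a long line. To keep the example short, the pattern is transcribed by hand from the schema adjunct rather than parsed from it; the names `FEET`, `LONG_LINE`, `stresses` and `scans` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List, Sequence

# Each foot as a tuple of stresses, following the table of metrical feet.
FEET = {
    "iamb": ("da", "dum"),
    "anapest": ("da", "da", "dum"),
    "tertius.paeon": ("da", "da", "dum", "da"),
}

# Hand transcription of the <sequence> of <choice> elements declared
# for <element context="long"> in the schema adjunct.
LONG_LINE: List[List[str]] = [
    ["iamb", "anapest"],
    ["iamb", "anapest"],
    ["iamb", "anapest", "tertius.paeon"],
]


def stresses(line_xml: str) -> List[str]:
    """Read the child element names (da or dum) of a marked-up line."""
    return [child.tag for child in ET.fromstring(line_xml)]


def scans(beats: Sequence[str], pattern: Sequence[Sequence[str]]) -> bool:
    """True if beats can be split into one foot per choice in the pattern."""
    if not pattern:
        return not beats  # success only if every beat has been consumed
    for foot in pattern[0]:
        shape = FEET[foot]
        # Try this foot, then backtrack to the next alternative on failure.
        if tuple(beats[:len(shape)]) == shape and scans(beats[len(shape):], pattern[1:]):
            return True
    return False


if __name__ == "__main__":
    line = """<long>
      <da>There</da> <dum>was</dum>
      <da>a</da> <da>young</da> <dum>la</dum>
      <da>dy</da> <da>named</da> <dum>Bright</dum>
    </long>"""
    print(scans(stresses(line), LONG_LINE))
```

The matcher backtracks because the feet differ in length, so the first alternative that fits one position may leave no valid split for the rest of the line.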