Spelling checking templates (Was: RE: Re: attribute cl
--- "Matthew L. Avizinis" <mla at gleim dot com> wrote:
> I'm happy knowing that there are widely varying differences of
> opinion on this matter.
> So, Dimitre, how precise is precise? If I were to define closest
> match to be words that contain all of the letters with one
> transposition, e.g. hte for the, or the spelling is correct except
> for one letter, e.g. mofe for mode, would that, iyo, be precise
> enough?
> Of course a spelling checker might prevent many of these kinds of
> errors in data entry, but it would still be, I believe, an
> interesting exercise to be able to catch these kinds of errors if
> data was entered without a spellchecker available (this would be
> another type of error I would include later because it seems like it
> would be more difficult to check for).
> Any more help, suggestions, (or even code)?
> thanks,
>
> > > Matthew L. Avizinis <mailto:mla@xxxxxxxxx>
> > > Gleim Publications, Inc.
> > > 4201 NW 95th Blvd.
> > > Gainesville, FL 32606
> > > (352)-375-0772
> > > www.gleim.com <http://www.gleim.com>
> > >
> > > Can be done in XSLT, but first you have to define precisely
> > "_closest match_".

Hi Matthew,

This is quite straightforward to do using FXSL. Please find below the
code that solves your particular problem, and that may be used as part
of a spelling checker implemented in XSLT.

Should I mention that I'm using FXSL here? :o)

Suppose you have the following source xml:

words2.xml:
----------
<elements>
  <element cana="1"/>
  <element cna="2"/>
  <element an="3"/>
  <element con="4"/>
  <element cbb="5"/>
</elements>

The result of applying the transformation presented below will be:

<elements>
  <element can="1" />
  <element can="2" />
  <element can="3" />
  <element can="4" />
  <element />
</elements>

As you can see, an added character (cana), two transposed adjacent
characters (cna), a deleted character (an) and a replaced character
(con) are all corrected to "can". Two replacements are not handled, so
the cbb attribute is dropped.

You may play with any other combinations of attribute names.
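For readers who have neither FXSL nor a node-set() extension at hand, the four single-character operations listed above can be sketched in plain XSLT 1.0. This is only an illustration of the idea, not the FXSL solution given in this message: it collects the candidate names in a '|'-delimited string and tests each attribute name with contains(). The parameter pTarget, the variable vAlphabet and the template names variants and letters are made up for this sketch.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- Hypothetical parameter: the correctly spelled attribute name -->
  <xsl:param name="pTarget" select="'can'"/>
  <xsl:variable name="vAlphabet" select="'abcdefghijklmnopqrstuvwxyz'"/>

  <xsl:template match="/">
    <!-- '|'-delimited candidate list, for example |an|acn|aan|... -->
    <xsl:variable name="vClose">
      <xsl:text>|</xsl:text>
      <xsl:call-template name="variants">
        <xsl:with-param name="pWord" select="$pTarget"/>
      </xsl:call-template>
    </xsl:variable>
    <elements>
      <xsl:for-each select="/elements/element">
        <xsl:copy>
          <!-- Keep only attributes whose name is one edit away from pTarget -->
          <xsl:for-each
               select="@*[contains($vClose, concat('|', name(), '|'))]">
            <xsl:attribute name="{$pTarget}">
              <xsl:value-of select="."/>
            </xsl:attribute>
          </xsl:for-each>
        </xsl:copy>
      </xsl:for-each>
    </elements>
  </xsl:template>

  <!-- Walks positions 1 .. length + 1 of pWord and emits the candidates
       for each position; the extra position allows appending a letter. -->
  <xsl:template name="variants">
    <xsl:param name="pWord"/>
    <xsl:param name="pPos" select="1"/>
    <xsl:if test="$pPos &lt;= string-length($pWord) + 1">
      <xsl:variable name="vHead" select="substring($pWord, 1, $pPos - 1)"/>
      <!-- Deletion of the letter at pPos -->
      <xsl:value-of
           select="concat($vHead, substring($pWord, $pPos + 1), '|')"/>
      <!-- Transposition of the letters at pPos and pPos + 1 -->
      <xsl:value-of
           select="concat($vHead, substring($pWord, $pPos + 1, 1),
                          substring($pWord, $pPos, 1),
                          substring($pWord, $pPos + 2), '|')"/>
      <!-- Insertion and replacement, once per letter of the alphabet -->
      <xsl:call-template name="letters">
        <xsl:with-param name="pHead" select="$vHead"/>
        <xsl:with-param name="pTail" select="substring($pWord, $pPos)"/>
        <xsl:with-param name="pLetters" select="$vAlphabet"/>
      </xsl:call-template>
      <xsl:call-template name="variants">
        <xsl:with-param name="pWord" select="$pWord"/>
        <xsl:with-param name="pPos" select="$pPos + 1"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <!-- For each letter c in pLetters: insert c before pTail, and replace
       the first letter of pTail with c. -->
  <xsl:template name="letters">
    <xsl:param name="pHead"/>
    <xsl:param name="pTail"/>
    <xsl:param name="pLetters"/>
    <xsl:if test="$pLetters">
      <xsl:variable name="vC" select="substring($pLetters, 1, 1)"/>
      <xsl:value-of select="concat($pHead, $vC, $pTail, '|')"/>
      <xsl:value-of select="concat($pHead, $vC, substring($pTail, 2), '|')"/>
      <xsl:call-template name="letters">
        <xsl:with-param name="pHead" select="$pHead"/>
        <xsl:with-param name="pTail" select="$pTail"/>
        <xsl:with-param name="pLetters" select="substring($pLetters, 2)"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Applied to words2.xml above, this sketch is meant to rename cana, cna, an and con to can and to drop cbb, as in the result shown. The delimited-string test trades the generality of a node-set for portability; it assumes attribute names never contain the '|' character, which holds for any legal XML name.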
Here's the transformation:

spelling.xsl:
------------
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:vendor="urn:schemas-microsoft-com:xslt"
 xmlns:delLetter="f:delLetter"
 xmlns:addLetter="f:addLetter"
 xmlns:addLetterSingle="f:addLetterSingle"
 xmlns:repLetter="f:repLetter"
 xmlns:repLetterSingle="f:repLetterSingle"
 xmlns:transPair="f:transPair"
 exclude-result-prefixes="vendor delLetter addLetter repLetter transPair repLetterSingle addLetterSingle"
 >

  <xsl:import href="str-foldl.xsl"/>

  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- Template references: each of these elements selects the template
       that str-foldl applies at every character of the string -->
  <delLetter:delLetter/>
  <addLetter:addLetter/>
  <repLetter:repLetter/>
  <transPair:transPair/>
  <repLetterSingle:repLetterSingle/>
  <addLetterSingle:addLetterSingle/>

  <xsl:variable name="validChars" select="'abcdefghijklmnopqrstuvwxyz'"/>

  <!-- Rename every attribute whose name is a close word of 'can' -->
  <xsl:template match="/">
    <xsl:variable name="vrtfCloseWords">
      <xsl:call-template name="closeWords">
        <xsl:with-param name="pWord" select="'can'"/>
      </xsl:call-template>
    </xsl:variable>

    <xsl:variable name="vCloseWords"
                  select="vendor:node-set($vrtfCloseWords)/*"/>

    <elements>
      <xsl:for-each select="/elements/element">
        <xsl:copy>
          <xsl:for-each select="@*[name()=$vCloseWords]">
            <xsl:attribute name="can">
              <xsl:value-of select="."/>
            </xsl:attribute>
          </xsl:for-each>
        </xsl:copy>
      </xsl:for-each>
    </elements>
  </xsl:template>

  <!-- All words one deletion, replacement, addition or adjacent
       transposition away from pWord -->
  <xsl:template name="closeWords">
    <xsl:param name="pWord"/>

    <xsl:call-template name="delLetterWords">
      <xsl:with-param name="pWord" select="$pWord"/>
    </xsl:call-template>

    <xsl:call-template name="repLetterWords">
      <xsl:with-param name="pWord" select="$pWord"/>
    </xsl:call-template>

    <xsl:call-template name="addLetterWords">
      <xsl:with-param name="pWord" select="$pWord"/>
    </xsl:call-template>

    <xsl:call-template name="transPairWords">
      <xsl:with-param name="pWord" select="$pWord"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Each ...Words template folds over the characters of the word.
       The accumulator carries the current position, the word and the
       close words collected so far. -->
  <xsl:template name="transPairWords">
    <xsl:param name="pWord"/>

    <xsl:variable name="vftransPair" select="document('')/*/transPair:*[1]"/>

    <xsl:variable name="vrtf-accum">
      <accum>
        <position>1</position>
        <word><xsl:value-of select="$pWord"/></word>
        <closewords></closewords>
      </accum>
    </xsl:variable>

    <xsl:variable name="vaccum" select="vendor:node-set($vrtf-accum)/*"/>

    <xsl:variable name="vrtfResults">
      <xsl:call-template name="str-foldl">
        <xsl:with-param name="pFunc" select="$vftransPair"/>
        <xsl:with-param name="pA0" select="$vaccum"/>
        <xsl:with-param name="pStr" select="$pWord"/>
      </xsl:call-template>
    </xsl:variable>

    <xsl:copy-of select="vendor:node-set($vrtfResults)/closewords/*"/>
  </xsl:template>

  <!-- Swap the character at the current position with the next one -->
  <xsl:template match="transPair:*">
    <xsl:param name="arg1" select="/.."/> <!-- A0 -->
    <xsl:param name="arg2"/>

    <xsl:variable name="vPos" select="$arg1/position"/>
    <xsl:variable name="vWord" select="$arg1/word"/>
    <xsl:variable name="vCloseWords" select="$arg1/closewords"/>

    <xsl:variable name="vNewWord"
                  select="concat(substring($vWord, 1, $vPos - 1),
                                 substring($vWord, $vPos + 1, 1),
                                 $arg2,
                                 substring($vWord, $vPos + 2)
                                 )"/>

    <position><xsl:value-of select="$vPos + 1"/></position>
    <word><xsl:value-of select="$vWord"/></word>
    <closewords>
      <xsl:copy-of select="$vCloseWords/*"/>
      <word><xsl:value-of select="$vNewWord"/></word>
    </closewords>
  </xsl:template>

  <xsl:template name="delLetterWords">
    <xsl:param name="pWord"/>

    <xsl:variable name="vfDelLetter" select="document('')/*/delLetter:*[1]"/>

    <xsl:variable name="vrtf-accum">
      <accum>
        <position>1</position>
        <word><xsl:value-of select="$pWord"/></word>
        <closewords></closewords>
      </accum>
    </xsl:variable>

    <xsl:variable name="vaccum" select="vendor:node-set($vrtf-accum)/*"/>

    <xsl:variable name="vrtfResults">
      <xsl:call-template name="str-foldl">
        <xsl:with-param name="pFunc" select="$vfDelLetter"/>
        <xsl:with-param name="pA0" select="$vaccum"/>
        <xsl:with-param name="pStr" select="$pWord"/>
      </xsl:call-template>
    </xsl:variable>

    <xsl:copy-of select="vendor:node-set($vrtfResults)/closewords/*"/>
  </xsl:template>

  <xsl:template name="repLetterWords">
    <xsl:param name="pWord"/>

    <xsl:variable name="vfRepLetter" select="document('')/*/repLetter:*[1]"/>

    <xsl:variable name="vrtf-accum">
      <accum>
        <position>1</position>
        <word><xsl:value-of select="$pWord"/></word>
        <closewords></closewords>
      </accum>
    </xsl:variable>

    <xsl:variable name="vaccum" select="vendor:node-set($vrtf-accum)/*"/>

    <xsl:variable name="vrtfResults">
      <xsl:call-template name="str-foldl">
        <xsl:with-param name="pFunc" select="$vfRepLetter"/>
        <xsl:with-param name="pA0" select="$vaccum"/>
        <xsl:with-param name="pStr" select="$pWord"/>
      </xsl:call-template>
    </xsl:variable>

    <xsl:copy-of select="vendor:node-set($vrtfResults)/closewords/*"/>
  </xsl:template>

  <!-- The word is padded with a trailing space so that the fold also
       visits the position after the last letter -->
  <xsl:template name="addLetterWords">
    <xsl:param name="pWord"/>

    <xsl:variable name="vfaddLetter" select="document('')/*/addLetter:*[1]"/>

    <xsl:variable name="vrtf-accum">
      <accum>
        <position>1</position>
        <word><xsl:value-of select="concat($pWord, ' ')"/></word>
        <closewords></closewords>
      </accum>
    </xsl:variable>

    <xsl:variable name="vaccum" select="vendor:node-set($vrtf-accum)/*"/>

    <xsl:variable name="vrtfResults">
      <xsl:call-template name="str-foldl">
        <xsl:with-param name="pFunc" select="$vfaddLetter"/>
        <xsl:with-param name="pA0" select="$vaccum"/>
        <xsl:with-param name="pStr" select="concat($pWord, ' ')"/>
      </xsl:call-template>
    </xsl:variable>

    <xsl:copy-of select="vendor:node-set($vrtfResults)/closewords/*"/>
  </xsl:template>

  <!-- Drop the character at the current position -->
  <xsl:template match="delLetter:*">
    <xsl:param name="arg1" select="/.."/> <!-- A0 -->
    <xsl:param name="arg2"/>

    <xsl:variable name="vPos" select="$arg1/position"/>
    <xsl:variable name="vWord" select="$arg1/word"/>
    <xsl:variable name="vCloseWords" select="$arg1/closewords"/>

    <xsl:variable name="vNewWord"
                  select="concat(substring($vWord, 1, $vPos - 1),
                                 substring($vWord, $vPos + 1)
                                 )"/>

    <position><xsl:value-of select="$vPos + 1"/></position>
    <word><xsl:value-of select="$vWord"/></word>
    <closewords>
      <xsl:copy-of select="$vCloseWords/*"/>
      <word><xsl:value-of select="$vNewWord"/></word>
    </closewords>
  </xsl:template>

  <!-- At the current position, run an inner fold over validChars that
       inserts each valid character in turn -->
  <xsl:template match="addLetter:*">
    <xsl:param name="arg1" select="/.."/> <!-- A0 -->
    <xsl:param name="arg2"/>

    <xsl:variable name="vPos" select="$arg1/position"/>

    <xsl:variable name="vfaddLetter"
                  select="document('')/*/addLetterSingle:*[1]"/>

    <xsl:variable name="vrtfResults">
      <xsl:call-template name="str-foldl">
        <xsl:with-param name="pFunc" select="$vfaddLetter"/>
        <xsl:with-param name="pA0" select="$arg1"/>
        <xsl:with-param name="pStr" select="$validChars"/>
      </xsl:call-template>
    </xsl:variable>

    <xsl:variable name="vResults" select="vendor:node-set($vrtfResults)/*"/>

    <position><xsl:value-of select="$vPos + 1"/></position>
    <xsl:copy-of select="$vResults[not(self::position)]"/>
  </xsl:template>

  <!-- Insert the single character arg2 before the current position -->
  <xsl:template match="addLetterSingle:*">
    <xsl:param name="arg1" select="/.."/> <!-- A0 -->
    <xsl:param name="arg2"/>

    <xsl:variable name="vPos" select="$arg1/position"/>
    <xsl:variable name="vWord" select="$arg1/word"/>
    <xsl:variable name="vCloseWords" select="$arg1/closewords"/>

    <xsl:variable name="vNewWord"
                  select="concat(substring($vWord, 1, $vPos - 1),
                                 $arg2,
                                 substring($vWord, $vPos)
                                 )"/>

    <position><xsl:value-of select="$vPos"/></position>
    <word><xsl:value-of select="normalize-space($vWord)"/></word>
    <closewords>
      <xsl:copy-of select="$vCloseWords/*"/>
      <word><xsl:value-of select="$vNewWord"/></word>
    </closewords>
  </xsl:template>

  <!-- At the current position, run an inner fold over validChars (minus
       the current character) that substitutes each one in turn -->
  <xsl:template match="repLetter:*">
    <xsl:param name="arg1" select="/.."/> <!-- A0 -->
    <xsl:param name="arg2"/>

    <xsl:variable name="vPos" select="$arg1/position"/>

    <xsl:variable name="vfrepLetter"
                  select="document('')/*/repLetterSingle:*[1]"/>

    <xsl:variable name="vrtfResults">
      <xsl:call-template name="str-foldl">
        <xsl:with-param name="pFunc" select="$vfrepLetter"/>
        <xsl:with-param name="pA0" select="$arg1"/>
        <xsl:with-param name="pStr"
                        select="translate($validChars, $arg2, '')"/>
      </xsl:call-template>
    </xsl:variable>

    <xsl:variable name="vResults" select="vendor:node-set($vrtfResults)/*"/>

    <position><xsl:value-of select="$vPos + 1"/></position>
    <xsl:copy-of select="$vResults[not(self::position)]"/>
  </xsl:template>

  <!-- Replace the character at the current position with arg2 -->
  <xsl:template match="repLetterSingle:*">
    <xsl:param name="arg1" select="/.."/> <!-- A0 -->
    <xsl:param name="arg2"/>

    <xsl:variable name="vPos" select="$arg1/position"/>
    <xsl:variable name="vWord" select="$arg1/word"/>
    <xsl:variable name="vCloseWords" select="$arg1/closewords"/>

    <xsl:variable name="vNewWord"
                  select="concat(substring($vWord, 1, $vPos - 1),
                                 $arg2,
                                 substring($vWord, $vPos + 1)
                                 )"/>

    <position><xsl:value-of select="$vPos"/></position>
    <word><xsl:value-of select="$vWord"/></word>
    <closewords>
      <xsl:copy-of select="$vCloseWords/*"/>
      <word><xsl:value-of select="$vNewWord"/></word>
    </closewords>
  </xsl:template>

</xsl:stylesheet>

Hope this helped.

=====
Cheers,

Dimitre Novatchev.
http://fxsl.sourceforge.net/ -- the home of FXSL

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list