
hyphenator in xsl implementing LIANG's algorithm

Subject: hyphenator in xsl implementing LIANG's algorithm
From: Bruno Mascret <bmascret@xxxxxxx>
Date: Wed, 26 Aug 2009 22:27:30 +0200
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

I have just finished the first version of a hyphenator written in XSLT 2.0
(plain XSLT, not XSL-FO) that implements Liang's algorithm, the one used in
TeX. (1)

It requires two files: the main code file (hyphenation.xsl), which contains
the hyphenate function, and the pattern file (hyphens.xsl), which contains
the sequence of patterns.

Files can be found here:
https://svn.liris.cnrs.fr/nat/trunk/xsl/hyphenation.xsl
http://liris.cnrs.fr/~bmascret/nat/xsl/hyphens.xsl (French rules) (2)

I also added a sample test file here:
http://liris.cnrs.fr/~bmascret/nat/xsl/testHyph.xml (3)

I used Saxon 9 as the XSLT processor.

hyphenation.xsl has a boolean parameter (debug) that switches on a debug
mode detailing the hyphenation process.

Sample outputs:
$~> java -jar saxon9.jar -s:testHyph.xml -xsl:hyphenation.xsl

 bon-jour
 le
 mon-de

$~> java -jar saxon9.jar -s:testHyph.xml -xsl:hyphenation.xsl  debug=true

{word: bonjour} -------
 * 564: {pattern used: 1bo} {result:1b0o0n0j0o0u0r0}
 * 657: {pattern used: 1j} {result:b0o0n01j0o0u0r0}
after LIANG: b0o0n1j0o0u0r0 : bon-jour
{word: le} -------
 * 674: {pattern used: 1le} {result:1l0e0}
after LIANG: l0e0 : le
{word: monde} -------
 * 595: {pattern used: 1de} {result:m0o0n01d0e0}
 * 693: {pattern used: 1mo} {result:1m0o0n0d0e0}
after LIANG: m0o0n1d0e0 : mon-de

    bon-jour
    le
    mon-de
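
For readers who do not know Liang's algorithm, the trace above can be
reproduced with the short Python sketch below. It is an illustration only,
not a port of hyphenation.xsl: the function names, the five-pattern list
(taken from the trace) and the left_min/right_min limits are assumptions
made for the example. Each pattern puts digits between letters, the highest
digit wins at each position, and an odd value allows a break.

```python
import re

# Only the patterns visible in the debug trace; the real pattern file is far larger.
PATTERNS = ["1bo", "1j", "1le", "1de", "1mo"]

def parse_pattern(pattern):
    """Split a pattern such as '1bo' into its letters 'bo' and digits [1, 0, 0]."""
    letters = re.sub(r"\d", "", pattern)
    digits = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            digits[pos] = int(ch)   # digit sits in the gap before letter number pos
        else:
            pos += 1
    return letters, digits

def hyphenate(word, patterns, left_min=2, right_min=2):
    """Return word with '-' at the break points allowed by the patterns.

    left_min and right_min (assumed values) are the minimum number of
    letters kept before the first and after the last break.
    """
    table = dict(parse_pattern(p) for p in patterns)
    padded = "." + word.lower() + "."      # dots mark the word boundaries
    scores = [0] * (len(padded) + 1)       # scores[i]: gap before padded[i]
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            digits = table.get(padded[start:end])
            if digits:
                for offset, d in enumerate(digits):
                    scores[start + offset] = max(scores[start + offset], d)
    pieces, last = [], 0
    for k in range(left_min, len(word) - right_min + 1):
        if scores[k + 1] % 2 == 1:         # odd value: break allowed before word[k]
            pieces.append(word[last:k])
            last = k
    pieces.append(word[last:])
    return "-".join(pieces)

for w in ["bonjour", "le", "monde"]:
    print(hyphenate(w, PATTERNS))
```

With these five patterns the sketch prints bon-jour, le and mon-de. The 1
that "1bo" and "1mo" place before the first letter never produces a break
because of the left_min limit, which matches the trace dropping it in the
"after LIANG" line.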

I hope this helps. If you have any suggestions or comments on how to
improve it, please let me know.
The comments in the code are in French, but I can translate them into
English if you wish.

Best regards,
Bruno Mascret

NAT's project: a free universal Braille translator
http://natbraille.free.fr

Footnotes:
(1) My goal was to use it in a more complex setting (Braille
transcription; see http://natbraille.free.fr, or
https://svn.liris.cnrs.fr/nat for the source code).

(2) The pattern file is generated automatically from a compatible
hyphenation dictionary (TeX, OpenOffice) by a Java program.
If you find it useful, you can use the following files:
https://svn.liris.cnrs.fr/nat/trunk/outils/HyphenationToolkit.java
(model) and https://svn.liris.cnrs.fr/nat/trunk/ui/ConfDictCoup.java
(view), or ask me for a standalone program if needed.
Non-French dictionaries can also be used.
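
As a rough idea of the first step of such a conversion, the Python sketch
below pulls the pattern list out of a TeX-style source, where patterns are
whitespace-separated inside \patterns{...} and % starts a comment. It is
only an assumption-laden illustration: extract_patterns is a hypothetical
helper, HyphenationToolkit.java may work quite differently, and the layout
of the generated hyphens.xsl is not shown here.

```python
import re

def extract_patterns(tex_source):
    """Return the patterns found in the first \\patterns{...} block, in order."""
    # Drop TeX comments: from an unescaped % to the end of the line.
    no_comments = re.sub(r"(?<!\\)%.*", "", tex_source)
    match = re.search(r"\\patterns\s*\{([^}]*)\}", no_comments)
    if match is None:
        return []
    return match.group(1).split()

# Tiny made-up sample reusing the patterns from the debug trace.
sample = "% sample only\n\\patterns{\n1bo 1j % comment\n1le\n1de 1mo\n}\n"
print(extract_patterns(sample))
```

The resulting list could then be written out in whatever form the
stylesheet expects.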

(3) The hyphenate function can of course be used with a different XML
structure ;-)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkqVmrIACgkQaOubDsBUvbtw7gCfZPRkNSTecgBmJryZYnnsRgvO
wwoAnRG2TZwRbLWvtJqK9b7wCZxfucXh
=dRjN
-----END PGP SIGNATURE-----
