[no subject]

This framework depends on some unique features of DITA but it could be adapted to generate HTML directly rather than DITA.

The transformation is implemented as a two-phase process:

Phase 1: Generate a simplified form of the Word XML, which I call "simple word processing format". This captures the essential structure and style details of the original Word document while eliminating all of the hideous verbosity of the Office Open markup design.

Phase 2: Transform the simple word processing doc into DITA. This relies on a separate style-to-tag mapping document that relates Word styles to DITA structures. This depends heavily on for-each-group and the code is a bit gnarly: it grew rather organically and, while it works, I can't claim it reflects the best engineering approach. If I were to ever rewrite the code I'm sure I would make it much cleaner and clearer.

This second phase could be replaced with a new HTML-generation phase, driven either by the existing style-to-tag map or by a new one or just by some static binding from styles to HTML markup (if such a thing is possible).

The Phase 1 process is pretty stable; I only have to update it when a client needs support for some new Word feature.

The code is in GitHub here:

https://github.com/dita4publishers/org.dita4publishers.word2dita

Cheers,

Eliot
----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 5/28/16, 8:39 PM, "adam adam@xxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

>hi
>
>I'm new to the list. My usual home is at the Collaborative Knowledge
>Foundation:
>http://coko.foundation/
>
>So, I was poking around looking for any community/co-ordinated attempts
>at creating some robust XSL transformations from docx to HTML. I'm aware
>of TEI stylesheets and have had a good poke around in github and
>elsewhere, but I'm looking at straight docx->html (sans TEI) and the few
>stylesheet repos I find are not so well maintained. I am probably
>missing some, so any recommendations for a thriving hub of energy around
>this particular conversion would be appreciated.
>
>However, what I'm really looking for is an active community, possibly
>with its own list or web based presence where there is a community
>effort to improve specific conversion types. Essentially, I'm wondering
>if this already exists for docx->html or if not, then are there similar
>attempts I can learn from?....my inclination is to look for, or set up,
>something that has a web based component for testing so that non-XSL
>experts could also contribute through manual QA of results etc...
>
>Any thoughts or tips welcomed....
>
>Adam
>
>--
>
>---
>Adam Hyde
>http://www.adamhyde.net/projects
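
For readers who want a feel for the Phase 2 idea described in the reply above (a style-to-tag map plus grouping of adjacent paragraphs), here is a minimal sketch in Python. It is not taken from the word2dita code, which is XSLT and targets DITA. The `STYLE_MAP` table, the `to_html` function, and the `(style, text)` input shape are all hypothetical stand-ins, and `itertools.groupby` only loosely plays the role that for-each-group with group-adjacent plays in XSLT.

```python
from html import escape
from itertools import groupby

# Hypothetical style-to-tag map: Word style name -> (wrapper tag or None, element tag).
# A real map would live in a separate document, as the reply describes.
STYLE_MAP = {
    "Heading 1": (None, "h1"),
    "Normal": (None, "p"),
    "List Bullet": ("ul", "li"),
    "List Number": ("ol", "li"),
}
DEFAULT = (None, "p")  # assumption: unmapped styles fall back to a plain paragraph


def to_html(paragraphs):
    """Turn (style_name, text) pairs into HTML.

    The pairs are an assumed stand-in for the simplified Phase 1 output.
    """
    out = []
    # Group runs of adjacent paragraphs that share the same wrapper tag,
    # so consecutive list items end up inside a single <ul> or <ol>.
    for wrapper, run in groupby(
        paragraphs, key=lambda p: STYLE_MAP.get(p[0], DEFAULT)[0]
    ):
        items = []
        for style, text in run:
            tag = STYLE_MAP.get(style, DEFAULT)[1]
            items.append(f"<{tag}>{escape(text)}</{tag}>")
        if wrapper:
            out.append(f"<{wrapper}>{''.join(items)}</{wrapper}>")
        else:
            out.extend(items)
    return "\n".join(out)


if __name__ == "__main__":
    doc = [
        ("Heading 1", "Title"),
        ("List Bullet", "first"),
        ("List Bullet", "second"),
        ("Normal", "Closing paragraph"),
    ]
    print(to_html(doc))
```

Keeping the map as plain data is what would let the same grouping logic drive either DITA or HTML output, which is the substitution the reply suggests; nested lists, inline formatting, and tables are deliberately left out of this sketch.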