Breaking paragraphs on linebreaks

Subject: Breaking paragraphs on linebreaks
From: "Manuel Souto Pico terminolator@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 9 May 2019 13:44:01 -0000
Dear all,

I have a bilingual TMX file with many tu elements like the one below, each
holding a full paragraph:

<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
   <header segtype="paragraph" adminlang="en"/>
   <body>
      <tu tuid="1">
         <tuv xml:lang="es">
            <seg>El PSOE ganaría en 10 de las 12 comunidades donde habrá
elecciones autonómicas el 26 de mayo, según el último barómetro del CIS.
&lt;br&gt;Las excepciones serían Cantabria, donde el PRC, el partido de
Miguel Ángel Revilla, sería primera fuerza. &lt;br&gt;&lt;br&gt;Navarra
Suma, la coalición de PP, Ciudadanos y UPN, sería primera fuerza en la
comunidad foral.</seg>
         </tuv>
         <tuv xml:lang="uz">
            <seg>PSOE, MDHning eng so'nggi barometri bo'yicha 26 mayda
bo'lib o'tadigan mintaqaviy saylovlarda 12 ta jamoaning 10tasida g'olib
chiqadi.&lt;br&gt;Istisnolarga ko'ra, Cantabria, XXR, Migel Anxel Revilla
partiyasi birinchi kuch bo'ladi.&lt;br&gt;&lt;br&gt;"Navarra Suma", PP,
Cuudadanos va UPN koalitsiyasi mintaqaviy hamjamiyatning birinchi kuchi
bo'ladi.</seg>
         </tuv>
      </tu>
   </body>
</tmx>

As you can see, there are a few escaped line break tags (&lt;br&gt;) between
sentences.

I would like to transform that into something like this, where every tu
element contains a single sentence (split at the line break tags, with empty
pieces dropped and the tuid values renumbered):

<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
   <header segtype="paragraph" adminlang="en"/>
   <body>
      <tu tuid="1">
         <tuv xml:lang="es">
            <seg>El PSOE ganaría en 10 de las 12 comunidades donde habrá
elecciones autonómicas el 26 de mayo, según el último barómetro del
CIS.</seg>
         </tuv>
         <tuv xml:lang="uz">
            <seg>PSOE, MDHning eng so'nggi barometri bo'yicha 26 mayda
bo'lib o'tadigan mintaqaviy saylovlarda 12 ta jamoaning 10tasida g'olib
chiqadi.</seg>
         </tuv>
      </tu>
      <tu tuid="2">
         <tuv xml:lang="es">
            <seg>Las excepciones serían Cantabria, donde el PRC, el partido
de Miguel Ángel Revilla, sería primera fuerza. </seg>
         </tuv>
         <tuv xml:lang="uz">
            <seg>Istisnolarga ko'ra, Cantabria, XXR, Migel Anxel Revilla
partiyasi birinchi kuch bo'ladi.</seg>
         </tuv>
      </tu>
      <tu tuid="3">
         <tuv xml:lang="es">
            <seg>Navarra Suma, la coalición de PP, Ciudadanos y UPN, sería
primera fuerza en la comunidad foral.</seg>
         </tuv>
         <tuv xml:lang="uz">
            <seg>"Navarra Suma", PP, Cuudadanos va UPN koalitsiyasi
mintaqaviy hamjamiyatning birinchi kuchi bo'ladi.</seg>
         </tuv>
      </tu>
   </body>
</tmx>
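
In case it helps to pin down the logic, here is a rough Python sketch of the
splitting I have in mind. It is not XSLT and only an illustration, not a
finished tool. The function name split_tus is made up, and the sketch rests on
several assumptions: each seg holds plain text only (no inline TMX elements),
the file has a body element, all tuv elements of a tu contain the same number
of non-empty pieces so they can be paired by position, and surrounding
whitespace in each piece can be trimmed. A tu whose languages do not line up
is kept unsplit.

```python
import xml.etree.ElementTree as ET

BR = "<br>"  # after parsing, the escaped &lt;br&gt; is the literal text "<br>"


def split_tus(tmx_text):
    """Split every tu into one tu per <br>-separated piece; return XML text."""
    root = ET.fromstring(tmx_text)
    body = root.find("body")
    old_tus = body.findall("tu")
    for tu in old_tus:
        body.remove(tu)

    next_id = 1
    for tu in old_tus:
        # One (tuv attributes, non-empty trimmed pieces) pair per language.
        variants = []
        for tuv in tu.findall("tuv"):
            text = tuv.findtext("seg") or ""
            pieces = [p.strip() for p in text.split(BR) if p.strip()]
            variants.append((dict(tuv.attrib), pieces))

        counts = {len(pieces) for _, pieces in variants}
        if len(counts) != 1:
            # Languages do not line up (or no tuv at all): keep the tu unsplit.
            tu.set("tuid", str(next_id))
            body.append(tu)
            next_id += 1
            continue

        # Emit one new tu per piece, pairing the languages by position.
        for i in range(counts.pop()):
            new_tu = ET.SubElement(body, "tu", dict(tu.attrib, tuid=str(next_id)))
            for attrib, pieces in variants:
                new_tuv = ET.SubElement(new_tu, "tuv", attrib)
                ET.SubElement(new_tuv, "seg").text = pieces[i]
            next_id += 1

    return ET.tostring(root, encoding="unicode")
```

Calling split_tus on the text of a TMX file like the one above should give
one tu per sentence, numbered from 1. The result is not pretty-printed, and
the XML declaration is not written back.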

Do you think I can use XSLT to do this more or less easily?

I wrote a few XSLT stylesheets years ago but I'm far from being a savvy
user.

Thanks in advance for any tips.

Cheers, Manuel
