Segmentation question

I'm trying to force segmentation in xlz files which contain a lot of strings like this one:

...Joseph Lubin.[5] In 2014, 

where Trados doesn't seem to want to segment at the full stop. I tried a pair of before-break and after-break segmentation rules like this:

.    .\[*\]

.\[*\]   .

but nothing seems to happen. Clearly I'm doing something wrong - again :) 

  •  

    Did you really mean .XLZ files?

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  •  

    Thanks... I know what it is, but this is most likely why you have a problem. An XLIFF can be resegmented only while it contains source text alone. Once translations are added, the segmentation is fixed and can no longer be changed without affecting the target content. Most CAT tools that respect XLIFF will not resegment a bilingual XLIFF. For example:

    <?xml version='1.0' encoding='UTF-8'?>
    <xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
      <file source-language="en-GB" target-language="de-DE" datatype="plaintext" original="file.txt">
        <body>
          <trans-unit id="1">
            <source>One of the key figures was Vitalik Buterin.[3] In 2013,</source>
          </trans-unit>
          <trans-unit id="2">
            <source>The project gained momentum with the help of Gavin Wood.[7] In 2015,</source>
          </trans-unit>
          <trans-unit id="3">
            <source>Among the early contributors was Charles Hoskinson.[2] In 2016,</source>
          </trans-unit>
          <trans-unit id="4">
            <source>Leadership soon included Aya Miyaguchi.[4] In 2018,</source>
          </trans-unit>
          <trans-unit id="5">
            <source>Another notable participant was Elizabeth Stark.[6] In 2017,</source>
          </trans-unit>
        </body>
      </file>
    </xliff>
    

    If I preview this file, a monolingual XLIFF, I get this:

    Screenshot of Trados Studio preview showing unsegmented XLIFF file with English text. File named 'unsegmented.xliff' with segments numbered 1 to 5.

    I can use a simple rule on the file type to turn the bracketed number, e.g. [3], into an excluded tag (which is probably what will help you):

    (?<=\.)\[\d+\]

    This gets me:

    Screenshot of Trados Studio preview showing unsegmented XLIFF file with English text. Segments are displayed without numbers, file named 'unsegmented.xliff'.
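
    To sanity-check the pattern outside Studio, here is a minimal Python sketch. It assumes Python's re module handles this particular expression the same way as the regex engine Studio uses, which should hold for a simple lookbehind like this one but is an assumption. The helper name strip_citations is purely illustrative and not part of Studio, and the sketch deletes the marker only to make the match visible, whereas the file type rule turns it into a tag. The last lines show why a pattern such as \.\[*\] finds nothing in ".[5]": \[* means "zero or more [ characters", so nothing ever consumes the digit.

```python
import re

# Same expression as the file type rule: a bracketed number
# that directly follows a full stop.
CITATION = re.compile(r"(?<=\.)\[\d+\]")


def strip_citations(text: str) -> str:
    """Remove markers such as [5] that directly follow a full stop.

    Hypothetical helper for illustration only; Studio converts the
    match into a tag rather than deleting it.
    """
    return CITATION.sub("", text)


sample = "...Joseph Lubin.[5] In 2014, "
print(repr(strip_citations(sample)))

# The pattern from the original question never matches ".[5]":
# \[* consumes "[" but the following \] then meets the digit "5".
print(re.search(r"\.\[*\]", sample))
```

    Once the marker is out of the way, the full stop is followed directly by a space and a capital letter, which is the situation an ordinary full-stop rule expects.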

    But if I try it with a bilingual:

    <?xml version='1.0' encoding='UTF-8'?>
    <xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
      <file source-language="en-GB" target-language="de-DE" datatype="plaintext" original="file.txt">
        <body>
          <trans-unit id="1">
            <source>One of the key figures was Vitalik Buterin.[3] In 2013,</source>
            <target>Eine der Schlüsselfiguren war Vitalik Buterin.[3] Im Jahr 2013,</target>
          </trans-unit>
          <trans-unit id="2">
            <source>The project gained momentum with the help of Gavin Wood.[7] In 2015,</source>
            <target>Das Projekt gewann mit Hilfe von Gavin Wood an Dynamik.[7] Im Jahr 2015,</target>
          </trans-unit>
          <trans-unit id="3">
            <source>Among the early contributors was Charles Hoskinson.[2] In 2016,</source>
            <target>Zu den frühen Mitwirkenden gehörte Charles Hoskinson.[2] Im Jahr 2016,</target>
          </trans-unit>
          <trans-unit id="4">
            <source>Leadership soon included Aya Miyaguchi.[4] In 2018,</source>
            <target>Zur Führung gehörte bald Aya Miyaguchi.[4] Im Jahr 2018,</target>
          </trans-unit>
          <trans-unit id="5">
            <source>Another notable participant was Elizabeth Stark.[6] In 2017,</source>
            <target>Eine weitere bemerkenswerte Teilnehmerin war Elizabeth Stark.[6] Im Jahr 2017,</target>
          </trans-unit>
        </body>
      </file>
    </xliff>
    

    I'll get this:

    Screenshot of Trados Studio preview showing segmented XLIFF file with English source and German target text. File named 'segmented.xliff' with segments numbered 1 to 5.

    This is because segmentation takes place on the source. If the target is already populated, Studio doesn't know what to do with the existing target text, so it refuses to resegment.
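
    If you are unsure whether a given file is still monolingual, a quick check along these lines can tell you before you start changing rules. This is only a sketch: it assumes plain XLIFF 1.2 with the standard namespace, as in the samples above, and has_translations is an illustrative name rather than anything from Studio.

```python
import xml.etree.ElementTree as ET

# XLIFF 1.2 namespace, as declared in the sample files.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}


def has_translations(xliff_text: str) -> bool:
    """Return True if any trans-unit carries a non-empty <target>."""
    root = ET.fromstring(xliff_text)
    for target in root.iterfind(".//x:trans-unit/x:target", NS):
        # itertext() also collects text nested inside inline elements.
        if "".join(target.itertext()).strip():
            return True
    return False


MONOLINGUAL = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file source-language="en-GB" target-language="de-DE" datatype="plaintext" original="file.txt">
    <body>
      <trans-unit id="1">
        <source>One of the key figures was Vitalik Buterin.[3] In 2013,</source>
      </trans-unit>
    </body>
  </file>
</xliff>"""

# Same unit with a populated target, as in the bilingual sample.
BILINGUAL = MONOLINGUAL.replace(
    "</source>",
    "</source>\n        <target>Eine der Schlüsselfiguren war "
    "Vitalik Buterin.[3] Im Jahr 2013,</target>",
)

print(has_translations(MONOLINGUAL))
print(has_translations(BILINGUAL))
```

    A file for which this reports populated targets is the kind that will keep its existing segmentation.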

    Paul Filkin | RWS Group

    [edited by: RWS Community AI at 10:25 PM (GMT 1) on 14 Apr 2025]
  • Thanks Paul - this huge batch happens to be monolingual, so that would do me for now. But which file type exactly do I add this exclusion to and how? Also, is it likely to mess up segmentation of other stuff?
