Segmentation question

Question

I'm trying to force segmentation in xlz files which contain a lot of strings like this one: 
 ...Joseph Lubin.[5] In 2014, 
 where Trados doesn't seem to want to segment at the full stop. I tried a pair of before and after segmentation rules like this: 
 . .\[*\] 
 .\[*\] . 
 but nothing seems to happen. Clearly I'm doing something wrong - again :)

Paul · Accepted Answer

Michael Bauer 
 Michael Bauer said: But it's greyed out 
 Simply because you have not enabled "Embedded Content", which is what this panel is all about. Also make sure you check "Extract in all paragraphs". 
 Michael Bauer said: that eye-watering thing above 
 Full Pattern: (?<=\.)\[\d+\] 
 This pattern matches text like [123], but only if it is immediately preceded by a full stop (.). 
 (?<=\.)

This is a positive lookbehind.

It asserts that what immediately precedes the match is a literal full stop (.).

The (?<=...) part is a zero-width assertion: it checks the condition but doesn't include it in the match.

So it matches only if the preceding character is a full stop, but the full stop is not included in the match.

\[

Matches a literal opening square bracket [.

Square brackets have a special meaning in regex (they define character classes), so to match a literal bracket, it must be escaped with a backslash (\).

\d+

\d matches any digit (equivalent to [0-9]).

+ means one or more.

Together, \d+ matches any number with at least one digit, e.g. 1, 42, 1000.

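\]

Matches a literal closing square bracket ]. Outside a character class an unescaped ] is normally treated as a literal anyway, so the escape is not strictly required here, but it is harmless and keeps the pattern unambiguous.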
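If you want to check the pattern outside Studio, here is a minimal sketch in Python. Studio itself uses the .NET regex engine, so Python's re module is only a stand-in here; this particular pattern uses nothing flavour-specific, so it should behave the same way. The second sample string is made up for the comparison.

```python
import re

# Same pattern as in the thread; Python's re module is used purely for illustration.
CITATION_AFTER_STOP = re.compile(r"(?<=\.)\[\d+\]")

samples = [
    "...Joseph Lubin.[5] In 2014,",         # marker directly after a full stop
    "as described in [12] and later work",  # made-up sample: no full stop before the marker
]

for text in samples:
    match = CITATION_AFTER_STOP.search(text)
    # The lookbehind is zero-width, so the full stop is never part of the match.
    print(repr(text), "->", match.group(0) if match else None)

# Illustration only: with the marker out of the way, the text reads "Lubin. In 2014,",
# which is the shape an ordinary full-stop segmentation rule expects.
stripped = CITATION_AFTER_STOP.sub("", samples[0])
print(stripped)
```

The first sample matches [5] only, and the second does not match at all because the marker is not preceded by a full stop.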

Paul · Answer

Michael Bauer Thanks... I know what it is, but this is most likely why you have a problem. XLIFF is resegmentable if it only contains source text. Once translations are added, the file becomes fixed-segmentation and segmentation changes are no longer possible without affecting the target content. Most CAT tools that respect XLIFF will not resegment a bilingual XLIFF. For example:

One of the key figures was Vitalik Buterin.[3] In 2013,

The project gained momentum with the help of Gavin Wood.[7] In 2015,

Among the early contributors was Charles Hoskinson.[2] In 2016,

Leadership soon included Aya Miyaguchi.[4] In 2018,

Another notable participant was Elizabeth Stark.[6] In 2017,

If I preview this file, a monolingual XLIFF, I get this:

[screenshot]

I can use a simple rule on the filetype to make the [nr] an excluded tag (which is probably what will help you):

(?<=\.)\[\d+\]

This gets me:

[screenshot]

But if I try it with a bilingual:

One of the key figures was Vitalik Buterin.[3] In 2013, Eine der Schlüsselfiguren war Vitalik Buterin.[3] Im Jahr 2013,

The project gained momentum with the help of Gavin Wood.[7] In 2015, Das Projekt gewann mit Hilfe von Gavin Wood an Dynamik.[7] Im Jahr 2015,

Among the early contributors was Charles Hoskinson.[2] In 2016, Zu den frühen Mitwirkenden gehörte Charles Hoskinson.[2] Im Jahr 2016,

Leadership soon included Aya Miyaguchi.[4] In 2018, Zur Führung gehörte bald Aya Miyaguchi.[4] Im Jahr 2018,

Another notable participant was Elizabeth Stark.[6] In 2017, Eine weitere bemerkenswerte Teilnehmerin war Elizabeth Stark.[6] Im Jahr 2017,

I'll get this:

[screenshot]

This is because segmentation takes place on the source, so if the target is already populated, Studio doesn't know what to do with it and refuses the segmentation.
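A quick way to see which case you are in is to check whether the file already has populated targets. This is only a rough sketch in Python: it assumes plain XLIFF 1.2 with the standard namespace, and the function name and the two inline samples are made up for illustration. The XLIFF inside your xlz files may use a different version or vendor extensions, so adjust the namespace accordingly.

```python
import xml.etree.ElementTree as ET

# Assumption: standard XLIFF 1.2 namespace; other versions use a different URI.
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}


def has_populated_targets(xliff_text: str) -> bool:
    """Return True if any <target> element contains non-whitespace text."""
    root = ET.fromstring(xliff_text)
    for target in root.iterfind(".//x:target", XLIFF_NS):
        if "".join(target.itertext()).strip():
            return True
    return False


# Hypothetical minimal samples built from the sentences used above.
MONOLINGUAL = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file source-language="en" datatype="plaintext" original="sample.txt">
    <body>
      <trans-unit id="1">
        <source>One of the key figures was Vitalik Buterin.[3] In 2013,</source>
      </trans-unit>
    </body>
  </file>
</xliff>"""

BILINGUAL = MONOLINGUAL.replace(
    "</source>",
    "</source>\n        <target>Eine der Schlüsselfiguren war Vitalik Buterin.[3] Im Jahr 2013,</target>",
)

print(has_populated_targets(MONOLINGUAL))  # source only
print(has_populated_targets(BILINGUAL))    # targets already filled in
```

If it reports populated targets, that is the situation described above where the resegmentation gets refused.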
