Hi,
Quite new to Trados Studio.
Haven't been able to find where I can add a segmentation rule so as to avoid wrong segmentation in source file.
Any help would be much appreciated.
Cheers
Roberto

There are fundamentally two ways to drive segmentation, and both require changes to be made before you create your project.
What are you trying to segment on?
Paul Filkin | RWS Group
________________________
Design your own training!
You've done the courses and still need to go a little further, or still not clear?
Tell us what you need in our Community Solutions Hub
I have a couple of cases where segmentation is not working properly in source file:
1) After abbreviation "etc."
I tried to delete "etc." from the abbreviations list in the TM source language, but it's not working
2) After a single capital letter (example: X. or W.)
In both cases I think the best would be to add custom sentence based segmentation rules, something like
Before break
W
Break characters
. (a dot)
Which options should I tick here? (Check abb, ordinal and punctuation)
After break (using regular expression)
1 or 2 empty spaces followed by a capital letter
Cheers
It would help me give you a better answer if you provided the following:
I ask for this because the abbreviations pretty much work out of the box and as expected. So I need to know what your specific circumstances are.
Paul Filkin | RWS Group
Filetype is Microsoft Word (docx)
1) etc.
Example:
...a channel, a pathway, combinations of these, etc. The attachment portion 405 can include...
Another example:
a channel, a pathway, combinations of these, etc. While the example of the device or implant
In the first example there are two spaces after etc., and in the second example only one.
I need the sentence to break after etc., since a new sentence starting with a capital letter follows immediately.
I tried to delete etc. from the abbreviation list in the TM source language, but it's still not working...
2) W. or Z. or X.
Examples:
...is greater than the width W. The thickness T being greater than....
...to move in an outward direction Z. This movement of the flexible portion 1687 in the...
...2182 in the inward direction X. The arms 2182 are connected to each other....
I noticed that in the abbreviations list all capital letters followed by a dot are there by default. I tried to delete them in the TM source language, but it's still not working...
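To make the intended behaviour concrete, here is a minimal Python sketch (not Trados itself) of where I'd want the breaks in the examples above. It assumes a break belongs after "etc." or after a single capital letter plus a dot, but only when one or two spaces and a capital letter follow. `[A-Z]` stands in for the `\p{Lu}` class, which Python's built-in `re` module does not support, and the name `split_sentences` is purely illustrative.

```python
import re

# Break point: the 1-2 spaces that follow "etc." or a lone capital
# letter plus dot, provided a capital letter comes next.
# [A-Z] is a stand-in for \p{Lu} (assumption: ASCII-only source text).
BREAK = re.compile(r"(?:(?<=\betc\.)|(?<=\b[A-Z]\.)) {1,2}(?=[A-Z])")

def split_sentences(text):
    """Split text at the break points described above (illustrative only)."""
    return BREAK.split(text)

# Both of these come back as two pieces:
print(split_sentences("combinations of these, etc.  The attachment portion 405"))
print(split_sentences("is greater than the width W. The thickness T being greater"))
```

This only shows the pattern I'm after; how it maps onto the before break / after break fields of a Trados segmentation rule is exactly what I'm unsure about.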
Cheers
ok... I used this sample file:
Then I did this:
I also attached my TM in case it helps:
Paul Filkin | RWS Group
Thanks a lot for your efforts!
So I created a new project, then I
1) Deleted both etc. entries in TM source language abbreviation list
2) Added a custom segmentation rule in TM source language, like this
\s\p{Lu}\. (Lu letters between curly brackets)
But when trying to open the sdlxliff file an error pops up
Solved the issue regarding the "reference not set" error (I was storing my resources files in Dropbox, and I noticed a few warnings when Trados files were opened, so I decided to move all my resources files into a laptop local folder just to be sure!)
However the segmentation rule is not working....
1) etc.
...a pathway, combinations of these, etc. While the example of the device or implant...
2) W. X. and so on...
...to move in an outward direction Z. This movement of the flexible portion....
So, I tried to create a new project this time using your segment.docx file and everything worked flawlessly...
Why is it not working with my source file?
Managed to make it work!
However, deleting the abbreviation for etc. and adding the custom segmentation rule solved one issue but created a couple of new ones...
1) etc.
It is not always necessary to break the sentence after etc. A break is needed only if etc. is followed by a space and then a capital letter.
I presume the abbreviation etc. must be kept in the list and a segmentation rule added instead, right?
2) Break after single ending capital letter
The segmentation rule is working fine for single capital letters like W. Z. and so on. But it also breaks the sentence after, for example, FIG. (I tried adding it to the abbreviation list, but that doesn't work either...).
Example:
as shown in FIG.
166) such that the paddle
In this case maybe it would be better to add some regex in the after break portion of segmentation rule, so as to break the sentence only if after a capital letter and a dot, there's an empty space followed by a capital letter!
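Outside Trados, the condition I have in mind could be sketched like this. It is a Python illustration only, assuming the break character is a dot and the after-break condition is "whitespace followed by a capital letter". `[A-Z]` stands in for `\p{Lu}`, which Python's `re` module lacks, and `break_positions` is a hypothetical helper name.

```python
import re

# After a dot, break only when whitespace and then a capital letter follow.
# [A-Z] is a stand-in for \p{Lu}.
AFTER_BREAK = re.compile(r"\.(?=\s+[A-Z])")

def break_positions(text):
    """Return the index just past each dot that qualifies as a break."""
    return [m.end() for m in AFTER_BREAK.finditer(text)]

# No break: the dot after FIG is followed by a digit.
print(break_positions("as shown in FIG. 166) such that the paddle"))
# One break: the dot after Z is followed by a space and a capital letter.
print(break_positions("to move in an outward direction Z. This movement"))
```

A condition like this would still break after FIG. whenever a capital letter happens to follow, so the abbreviation list may still be needed for that case.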
Cheers
ok - this is why you could help yourself a lot more by providing a comprehensive and specific example file for anyone willing to help to work with. Not everyone will be prepared to spend time guessing what you need.
I created a new sample file:
I used the same TM I gave you earlier and added one exception to the full stop rule:
This resulted in this, which seems to behave as you want:
I did not need to make any changes to the abbreviations list.
If it still doesn't work for you please provide a small sample file containing the offending sentences.
Paul Filkin | RWS Group
It still doesn't work on my end...
Anyway here's a sample docx file with all use cases, including a description of what I need to achieve!
I have no idea why it doesn't work on your end. I suggest you refer to the video again to make sure you are applying this correctly and are always opening the file against the updated TM. I took your sample file, did a little more of what we have already discussed, and got this, which I think is what you're after:
Here's my TM:
Only changes I made to accommodate your sample file were these exceptions to the full stop rule:
Paul Filkin | RWS Group