How to apply paragraph segmentation?

Hi community,

I have read a couple of discussions on this topic, but I still have not been able to solve the following problem:

I have an XML file that often contains several sentences in one element, e.g.:

<element>This is a sentence. This is another sentence. Look at this: another sentence.</element>

When I create the Studio project from this file, I get 4 individual segments (split after each full stop or colon). However, I can see that the TM already contains many segments that hold all the sentences of an element, which results in a high number of fuzzy matches where there could be 100% matches. I assumed that changing the segmentation rules from sentence-based to paragraph-based would give me one segment per element, but apparently it's not working. What can I do to avoid either merging hundreds of segments or copying and pasting hundreds of concordance matches?
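
To make the difference concrete, here is a rough Python sketch of the two segmentation modes applied to the sample element. It only illustrates the expected outcome and is not how Studio implements its rules: the regular expression and the function names `sentence_segments`, `paragraph_segments` and `segment_file` are assumptions made up for this example.

```python
import re
import xml.etree.ElementTree as ET

SAMPLE = (
    "<element>This is a sentence. This is another sentence. "
    "Look at this: another sentence.</element>"
)

def sentence_segments(text):
    # Rough approximation of sentence-based rules:
    # break after '.', '!', '?' or ':' when whitespace follows.
    parts = re.split(r"(?<=[.!?:])\s+", text.strip())
    return [p for p in parts if p]

def paragraph_segments(text):
    # Paragraph-based: the whole element text stays one segment.
    text = text.strip()
    return [text] if text else []

def segment_file(xml_text, segmenter):
    # Apply the chosen segmenter to every <element> in the XML.
    root = ET.fromstring(xml_text)
    segments = []
    for el in root.iter("element"):
        segments.extend(segmenter(el.text or ""))
    return segments

if __name__ == "__main__":
    for name, fn in (("sentence", sentence_segments),
                     ("paragraph", paragraph_segments)):
        segs = segment_file(SAMPLE, fn)
        print(name, len(segs), segs)
```

With sentence-based splitting the sample yields four segments; with paragraph-based splitting it yields one, which is the form the existing TM entries have.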

Thanks for any hints!

  • That's strange. Something has not worked correctly here.
    It should work like this:
    First: change the segmentation rules of the TM. Only the rules of the source language are relevant. Save.
    Second: create a new project and run only the batch tasks "Convert to Translatable Format" and "Copy to Target Languages".
    It's also important that your 3 TMs are at the top of the TM list. Studio applies the segmentation rules of the TM at the top of the list (language-specific, I guess).
    Third: open the xliff to check the segmentation.
    Fourth: perform the analysis.

    You can also send me your files if you like: sebastien.desautel@rws-group.de
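
For the third step, checking the segmentation by eye gets tedious with hundreds of elements. The following is a minimal sketch that counts segments per translation unit, assuming the converted file follows the XLIFF 1.2 convention of marking segments with `<mrk mtype="seg">` inside `<seg-source>`. The function name `segments_per_unit` and the sample content are made up for illustration; a real file from your project may contain additional vendor-specific markup.

```python
import xml.etree.ElementTree as ET

# XLIFF 1.2 namespace; assumed to be the default namespace of the file.
XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"

# Hypothetical sample showing sentence-based segmentation of one element.
SAMPLE_XLIFF = """<xliff xmlns="urn:oasis:names:tc:xliff:document:1.2" version="1.2">
  <file original="sample.xml" source-language="en-US" target-language="de-DE" datatype="xml">
    <body>
      <trans-unit id="u1">
        <source>This is a sentence. This is another sentence. Look at this: another sentence.</source>
        <seg-source><mrk mtype="seg" mid="1">This is a sentence.</mrk> <mrk mtype="seg" mid="2">This is another sentence.</mrk> <mrk mtype="seg" mid="3">Look at this:</mrk> <mrk mtype="seg" mid="4">another sentence.</mrk></seg-source>
      </trans-unit>
    </body>
  </file>
</xliff>"""

def segments_per_unit(xliff_text):
    """Return {trans-unit id: number of segments} for an XLIFF 1.2 string."""
    root = ET.fromstring(xliff_text)
    counts = {}
    for tu in root.iter(f"{{{XLIFF_NS}}}trans-unit"):
        seg_source = tu.find(f"{{{XLIFF_NS}}}seg-source")
        if seg_source is None:
            continue  # unit has not been segmented
        marks = [m for m in seg_source.iter(f"{{{XLIFF_NS}}}mrk")
                 if m.get("mtype") == "seg"]
        counts[tu.get("id")] = len(marks)
    return counts

if __name__ == "__main__":
    for unit_id, n in segments_per_unit(SAMPLE_XLIFF).items():
        flag = "" if n == 1 else "  <- more than one segment"
        print(unit_id, n, flag)
```

If paragraph segmentation has been applied, every unit should report exactly one segment; any unit with a higher count was still split by sentence-based rules.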