Pre-translation does not work

In a localisation project we were given reference material, which I converted into a TM. We were first given only the missing sentences to translate. After translation I created a project TM.

When creating a project with the complete file to localise, only a very small percentage (~0.1%) was pre-translated.

I had checked the TM contents beforehand (they are organised in such a way that we should normally get only Context Matches), and the TMs are activated.

What could be wrong?

  • Hi,

    How did you create the TM? Could it be that there was a penalty being applied which reduced your matches to 99% and therefore missed the default 100% for pre-translation?

    Regards

    Paul

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Dear Paul,
    I found the main reason: I had not noticed that I was using the wrong file (target instead of source).
    However, the analysis is still not satisfactory.
    We received resx files as source, plus reference target files (in which untranslated strings were simply left empty).
    I created a TM by converting the source and target files into XML, then building a bilingual Excel file, which I opened in Studio with a new TM that I then updated (with the option to replace existing segments not activated).
    When creating the project with the complete resx, I added this TM and the project TM (which contains only the previously empty segments) to make sure everything would be in the right order.
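
For plain-text resx files like the ones described above, the pairing step of such a conversion could be sketched roughly as follows. This is only an illustration under assumptions, not the workflow actually used in this thread: the function names `read_resx` and `pair_resx` and the sample strings are made up, and it assumes every translatable string sits in a `<data name="..."><value>` entry with the same `name` in source and target.

```python
import xml.etree.ElementTree as ET


def read_resx(xml_text):
    """Return {name: value} for every <data> entry in a resx document."""
    root = ET.fromstring(xml_text)
    entries = {}
    for data in root.iter("data"):
        name = data.get("name")
        # findtext returns "" when <value> exists but is empty
        value = data.findtext("value", default="")
        if name is not None:
            entries[name] = value
    return entries


def pair_resx(source_xml, target_xml):
    """Pair source and target strings by resx key, skipping untranslated entries."""
    source = read_resx(source_xml)
    target = read_resx(target_xml)
    pairs = []
    for name, src in source.items():
        tgt = target.get(name, "")
        # Empty targets are the "not yet translated" strings: leave them out of the TM
        if src.strip() and tgt.strip():
            pairs.append((name, src, tgt))
    return pairs


# Made-up sample data, for illustration only
SOURCE = """<root>
  <data name="Greeting" xml:space="preserve"><value>Hello</value></data>
  <data name="Farewell" xml:space="preserve"><value>Goodbye</value></data>
</root>"""

TARGET = """<root>
  <data name="Greeting" xml:space="preserve"><value>Bonjour</value></data>
  <data name="Farewell" xml:space="preserve"><value></value></data>
</root>"""

for name, src, tgt in pair_resx(SOURCE, TARGET):
    print(name, src, tgt, sep="\t")
```

Each pair keeps the whole `<value>` as one unit, exactly like one Excel cell. That is convenient for checking, but it is not necessarily how Studio segments the same text when it opens the resx natively.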
  • Dear Beate,

    It's going to be tough to give you a precise answer without seeing your files but I can see at least one potential problem with your workflow. If there are any tags in the resx file then converting the way you do could result in the tags becoming translatable text by the time they are in the Excel file, or not there at all. This would obviously result in a loss of leverage when running the resx files through Studio against your TM.

    Why didn't you just align the resx files so the correct filetypes were used for your reference TM?

    Regards

    Paul


  • Aligning turned out to be less effective and much more time-consuming than the conversion method (which I was able to check rather quickly).
    There are no tags at all since there is only plain text.
    What files would you need? How can I send them to you? (TMs and bilingual --> 14 MB)
  • No... just send me the following:

    1. source resx and target resx for a pair of files you know are not giving you the expected match
    2. your conversion of the same files to Excel

    That's it. You can email them to pfilkin@sdl.com

    Regards

    Paul


  • Thanks for the file Beate... I can see why you won't use Alignment with files like these, very painful indeed!  So this is what I did.

    First I did use Alignment, and I ran an analysis on the file using the aligned TM and the alignment penalty set to zero.  I get this:

    I then did the same thing after using the Glossary Converter to convert your Excel file to TMX and then upgrading it.  I get this:

    Slightly different, but still not the result you wanted and the differences will be mostly accounted for by placeables.  There are also things that should be tags in these files but I did not prepare them at all so everything is translatable text and probably a like for like with your workflow.

    So, why are these not all 100%?  Well, the first thing is easy... duplicate translation penalties.  For example:

    This would account for so many 99% matches throughout the file.  Then you have things that are linked to segmentation like this:

    So you only have a 69% match against the Excel-based TM because Studio has segmented the text in the resx differently from your Excel sheet.  You can see this because #71 is separate from #72, and yet the first match in the TM includes the whole thing.  If I use the aligned TM I see this:

    You might not have exactly the same result, as you used the Bilingual Excel file type to create your TM... although if I check it using the Bilingual Excel file type I see this, which is to be expected as it will use the cells for segmentation, which is different from using the resx natively:

    Anyway... I hope I gave you enough food for thought and you can see why you are not going to get the Context Matches all the way through the file?
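
The two effects described above (the same source text stored with more than one translation, and cell-level TM entries versus sentence-level segments) can be checked roughly on a list of source/target pairs before building a TM. The sketch below is only a diagnostic aid under assumptions: the helper names and sample strings are made up, the sentence splitter is a crude stand-in for real segmentation rules, and `difflib` similarity is not Studio's fuzzy match score.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher


def sources_with_multiple_targets(pairs):
    """Map each source text to its distinct targets, keeping only sources with more than one."""
    targets = defaultdict(set)
    for src, tgt in pairs:
        targets[src].add(tgt)
    return {src: sorted(t) for src, t in targets.items() if len(t) > 1}


def naive_sentences(text):
    """Split after ., ! or ? followed by whitespace (a crude stand-in for segmentation rules)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def similarity(a, b):
    """Rough character-level similarity in percent (difflib, not a TM fuzzy score)."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


# Made-up sample data, for illustration only
PAIRS = [
    ("Open", "Ouvrir"),
    ("Open", "Ouvert"),
    ("Save the file. Then close it.", "Enregistrez le fichier. Puis fermez-le."),
]

# Sources stored with several translations are candidates for a duplicate translation penalty
print(sources_with_multiple_targets(PAIRS))

# A multi-sentence cell stored as one TM unit will not fully match its individual sentences
cell = PAIRS[2][0]
for sentence in naive_sentences(cell):
    print(sentence, similarity(sentence, cell))
```

Any source flagged by the first check, and any cell that the second check splits into several sentences, is a place where something less than a 100% or Context Match should be expected.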

    Regards

    Paul

