Studio not auto-localizing tokens

Hi all,

When I pre-translate a test file with an empty TM, and the file contains tokens such as figures, acronyms (three-letter currency codes) and dates, only the figures are auto-localized. This happens even though recognition of all tokens (including acronyms and dates) is enabled in both my TM settings and the auto-substitution settings of the target language pair. Is there a way to auto-localize the other tokens as well?

The practical case would be a huge file with a lot of repetitions. There, extracting an Unknown Segments file using an empty TM is very useful for getting rid of all those repetitions. The problem arises because Studio extracts only one of the TUs that contain nothing but tokens (such as dates or acronyms), since the rest are treated as repetitions. If you then pre-translate the source file with the translated Unknown Segments file, those tokens are not automatically localized and need manual fixing.
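
As a rough illustration of why only one token-only TU ends up in the Unknown Segments file, here is a simplified sketch. It is not Studio's actual logic: the regular expressions, the placeholder names and the functions `normalize` and `extract_unknown` are all hypothetical, and Studio's real token recognizers are far richer.

```python
import re

# Hypothetical token patterns, for illustration only.
TOKEN_PATTERNS = [
    ("date", re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")),
    ("acronym", re.compile(r"\b[A-Z]{3}\b")),
    ("number", re.compile(r"\b\d+(?:[.,]\d+)*\b")),
]


def normalize(segment):
    """Replace every recognized token with a placeholder naming its type."""
    for name, pattern in TOKEN_PATTERNS:
        segment = pattern.sub("{" + name + "}", segment)
    return segment


def extract_unknown(segments):
    """Keep only the first segment seen for each normalized form."""
    seen = set()
    unknown = []
    for seg in segments:
        key = normalize(seg)
        if key not in seen:
            seen.add(key)
            unknown.append(seg)
    return unknown


segments = ["01/02/2023", "15/11/2024", "EUR", "USD", "1,500", "2,750"]
print(extract_unknown(segments))
```

Under these assumptions, segments that differ only in their token values collapse to the same key, so only the first date, the first acronym and the first figure are extracted. Every other one then depends on auto-localization at pre-translation time.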

Many thanks in advance for any helpful ideas!

Kind regards,

Carlos

  • How about an example file so we can see what you're working with?

    The practical case would be a huge file that has a lot of repetitions.

    I wouldn't use unknown segments, although you could.  Probably easier to simply filter on them, handle them all in one go, change status to translated and then lock them or hide them.

    Paul Filkin | RWS


  • Hi Paul,

    Thanks for your prompt answer!

    Attached are the test files that I used for this. I have already extracted the Unknown Segments file, translated it and populated the TM with it. Please let me know your findings!

    The reason why I use the Unknown Segments file:

    • When working with hundreds or even thousands of files, this option is much more efficient, since you only have to work with one file.
    • Extracting the Unknown Segments also reduces the file size significantly (which helps with package creation too).
    • Studio also works more smoothly when dealing with just one file.

    Another option would be to run two rounds of pre-translation on the source files: one with an empty TM, so that all the tokens are auto-localized, and another with the translated Unknown Segments file. However, I still face the same issue here, since not all the tokens are auto-localized even with the empty TM.

    I have also considered disabling token recognition in the TM used to extract the Unknown Segments. However, the word count then increases significantly, since all the tokens would be counted as ordinary text, and we would lose the auto-localization that Studio offers.

    What am I missing? Maybe some settings that I did not pay attention to?

    Thank you very much in advance

    Carlos

    SDL Unknown Segments Test files.zip
