MultiTerm 2021 Extract - how to fix gibberish in Latvian term extraction projects?

Hey!

MultiTerm 2021 Extract does not seem to decode Latvian text correctly (although the description says Latvian is supported), and I get gibberish results:

Screenshot of Trados Studio MultiTerm 2021 Extract showing a list of Latvian terms with incorrect encoding resulting in unreadable text.

Is there any way to fix this? I use this tool to extract terms, but this issue makes it useless.

Please let me know.

Kind regards

Kaspars Rutkis



  • Thanks for sending the file over. The text file you sent is UTF-8, and I can reproduce your problem with it. Because it is a plain text file, it contains nothing that tells the processor in MultiTerm Extract that the encoding is UTF-8, so the characters get corrupted. I added a Byte Order Mark to the file, and now it extracts correctly:

    Screenshot of Trados Studio showing a list of terms with scores and domains, some terms are in a non-Latin script, possibly corrupted due to encoding issues.

    I'll send you my file by email so you can test it.
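
For anyone who wants to add the Byte Order Mark themselves rather than wait for a fixed file, here is a minimal Python sketch of one way to do it. It is not part of MultiTerm or Trados; the function name `add_utf8_bom` and the file names in the usage note are hypothetical, and it assumes the source file really is UTF-8 (it stops with an error if it is not).

```python
import codecs
import sys
from pathlib import Path


def add_utf8_bom(src: str, dst: str) -> bool:
    """Copy src to dst, prepending a UTF-8 BOM if it is missing.

    Returns True if a BOM was added, False if src already had one.
    """
    data = Path(src).read_bytes()

    # A file that already starts with EF BB BF is copied unchanged.
    if data.startswith(codecs.BOM_UTF8):
        Path(dst).write_bytes(data)
        return False

    # Assumption: the input is UTF-8. This raises UnicodeDecodeError
    # if it is not, so a wrongly encoded file is never silently marked.
    data.decode("utf-8")

    Path(dst).write_bytes(codecs.BOM_UTF8 + data)
    return True


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python add_bom.py input.txt output.txt
    added = add_utf8_bom(sys.argv[1], sys.argv[2])
    print("BOM added" if added else "BOM already present")
```

Run it as, for example, `python add_bom.py latvian_source.txt latvian_source_bom.txt` and point the extraction project at the output file. Many text editors can do the same thing by saving the file as "UTF-8 with BOM".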

    Paul Filkin | RWS Group
