How do I export or convert a MultiTerm termbase into TBX-Basic?

I want to export or convert a MultiTerm termbase into a TBXv2 TBX-Basic file so that I can import it into Lingo.

As far as I can see, MultiTerm doesn't offer TBX-Basic export as one of the export types on the database admin page. I see TBX 2002 export, which I take to mean TBXv2 in the standard or default dialect, not the Basic dialect. Lingo seems to confirm this, because when I try to import the TBX file into Lingo it displays the error message "Not a TBX Basic file. Lingo currently only supports TBX Basic."

Am I correct in that conclusion or is there a TBXv2 TBX-Basic export type that I somehow haven't seen? I've only just started using MultiTerm, so it's quite possible that I've missed something.

If MultiTerm doesn't offer TBX-Basic export out of the box, is there a plug-in available to extend it and give it this capability?

Failing that, are there any other apps or utilities to convert the MultiTerm termbase into TBXv2 TBX-Basic? The only one I've come across so far is Glossary Converter. This claims to support TBX-Basic but I can't get that to work. There's a file format radio button for TBX dialect with the three options of Core, Min, and Basic as I'd expect, but switching between these doesn't seem to make any difference to the output file. No matter which of the three dialects I choose, the .tbx file contents are identical. Is there anyone on here who has more experience using Glossary Converter and who could suggest what I might be doing wrong?

Any suggestions welcome!

Regards,

Bruce Officer

  •  

    Use the Glossary Converter, it has these options for TBX:

    Trados Studio Glossary Converter options showing TBX dialect choices: Core, Min, Basic, with Core selected. Options to resolve note field content, write history data, and use TBX 3 Mapping File are visible.

    If you have Trados Studio  2021 or 2022 you can download this tool via the (Missing Wiki Page) . 

    If you don't then you can also find it here: https://cerebus.de/glossaryconverter/index.html

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub


    Generated Image Alt-Text
    [edited by: Trados AI at 2:11 PM (GMT 0) on 5 Mar 2024]
  • Thank you for replying, Paul, but (as I said) I've already tried the Glossary Converter. The TBX it generates is rejected by Lingo's import function with the error message "Not a TBX Basic file. Lingo currently only supports TBX Basic."

    It doesn't seem to make any difference which of the three TBX dialects I choose in the Glossary Converter's Formats > TBX window: the TBX file output by Glossary Converter is identical in all three cases. I've verified this by opening the files in Notepad++ and doing file compares.

    I've even been in contact with the developer of Glossary Converter. It seems that the TBX dialect radio buttons alter the available mappings when you are converting TBX back into other formats, but they DON'T change much when you're converting in the SDLTB -> TBX direction. The 'type' attribute in the root <martif> element, for example, remains <martif type="TBX" xml:lang="en"> in the TBX files output by Glossary Converter regardless of the TBX dialect you think you've selected.

    The developer's suggestion was to try manually changing this to <martif type="TBX-Basic" xml:lang="en"> in my text editor prior to importing it into Lingo. I'm going to try that, but it would be disappointing to have to settle for this manual process. I'm sure there must be other people who've had to transfer termbases between MultiTerm and Lingo and I'm hoping one of them has found a slicker solution.
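For anyone who wants to repeat that comparison without a diff tool, here is a minimal Python sketch. It assumes each export is read in as bytes; the helper name `describe_tbx` is hypothetical and not part of Glossary Converter or Lingo. It reports the root element name, the root `type` attribute, and a SHA-256 digest, so byte-identical exports produce identical results.

```python
import hashlib
import xml.etree.ElementTree as ET


def describe_tbx(data: bytes) -> tuple:
    """Return (root tag, root 'type' attribute, SHA-256 hex digest) for one TBX file."""
    # Parse from bytes so the file's own XML encoding declaration is honoured.
    root = ET.fromstring(data)
    return root.tag, root.get("type"), hashlib.sha256(data).hexdigest()
```

Calling it on the bytes of each of the three exports (for example via `pathlib.Path(...).read_bytes()`) and comparing the returned tuples shows at a glance whether the dialect choice changed anything.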

  •  

    The TBX it generates is rejected by Lingo's import function with the error message "Not a TBX Basic file. Lingo currently only supports TBX Basic."

    I think it would be interesting to learn more from the Lingo side about what they consider a TBX-Basic file to be and why these files fail. I know that the developer of the Glossary Converter worked directly with the TBX committee, at their request, to be able to handle these formats as expected. Do you have any information on this, or even a link to something they refer to in their documentation?

    I'll make the developer aware of this thread though as he might be interested to comment.

    Paul Filkin | RWS Group

  • I'll ask MadCap what criteria Lingo uses to determine if a file is valid TBX-Basic or not. I'm not sure how much useful info I'll get from them, though. I get the impression that MadCap consider Lingo a minor adjunct to their flagship Flare authoring environment and don't assign much resource to supporting or updating it. Lingo hasn't had a major update since 2016 if I understand correctly, whereas Flare is updated at least once a year.

    This isn't the first issue I've encountered passing translation files between Lingo and RWS products. When I used Lingo to bundle up a Flare project into XLIFF for export to a Trados-using LSP, Lingo wouldn't reimport the return bundle. Peeking inside the files with Notepad++, I saw that Trados had overwritten the segmentation tags with ones using a different syntax. Not that I'd blame Trados: the XML structure and segmentation tagging in the original Lingo XLIFF export was frankly a mess. Unfortunately Lingo didn't seem to like those changes. I only mention this example because it reinforces my assessment of Lingo as being OK if used in isolation but poor when it comes to compatibility with standard translation file types.

  • The TBX options in the converter apply to the TBX v3 format only, and that's what the label should say; apologies for that UI bug. So when you choose TBX as the output format (as opposed to TBX V3), they have no effect. TBX V2 Basic is a flavour of TBX V2 that is currently not supported. If changing the label manually does make a difference, please let me know; it'll help in improving the converter export.

  • Thank you for pitching in!

    I have just tried taking the TBX2 output from Glossary Converter and changing the root element from <martif type="TBX" xml:lang="en"> to <martif type="TBX-Basic" xml:lang="en">. That seems to be enough to get Lingo to accept it's a TBX-Basic file and at least try to interpret the fields in it.

    The Lingo termbase import still fails, though. It imports concepts 1, 2, and 3 OK but fails on concept 4 with the less than helpful message "Import failed for term newc4 : Object reference not set to an instance of an object." I'm going to try troubleshooting by comparing the concepts to see which fields exist for concept 4 that aren't in concepts 1 to 3 then narrow down by commenting fields out until I can work out which fields Lingo objects to. Wish me luck!
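That comparison can be roughed out in Python rather than done by eye. The sketch below is only an illustration: the function names are hypothetical, and it assumes each concept is a `termEntry` element with an `id` attribute and that fields are distinguished by element name plus `type` attribute. It lists the fields used by the failing entry that none of the passing entries use.

```python
import xml.etree.ElementTree as ET


def field_inventory(entry):
    """Set of (element name, 'type' attribute) pairs used anywhere in one termEntry."""
    return {(el.tag, el.get("type")) for el in entry.iter()}


def fields_unique_to(tbx_text, failing_id, passing_ids):
    """Fields present in the failing entry but in none of the passing entries."""
    root = ET.fromstring(tbx_text)
    # Index entries by id; the id values passed in are whatever the export uses.
    entries = {e.get("id"): e for e in root.iter("termEntry")}
    known_good = set()
    for pid in passing_ids:
        known_good |= field_inventory(entries[pid])
    return field_inventory(entries[failing_id]) - known_good
```

The returned pairs are candidates to comment out first; an empty result would suggest the problem lies in the field values (for example the characters in a term) rather than in which fields are present.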

  • UPDATE:

    I've successfully imported the TBX file generated by Glossary Converter into Lingo.

    After manually changing the root element from <martif type="TBX" xml:lang="en"> to <martif type="TBX-Basic" xml:lang="en">, the only other thing I needed to do was to delete all Chinese entries, up to and including the langSet tags. For some reason I've not yet sussed out, Lingo is OK with the way Japanese characters appear in the XML but can't handle the Chinese characters. Chinese isn't a requirement for me at the moment, so I can live with that.
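Those two manual edits could be scripted roughly as follows. This is a sketch under stated assumptions, not a tested Lingo workflow: it assumes the Chinese langSets carry an `xml:lang` value starting with `zh`, and the function name `patch_tbx` is hypothetical. It works on the file text with regular expressions, mirroring the text-editor approach, so everything else in the file (including any DOCTYPE line) is left untouched.

```python
import re

# Opening tag of the root element; <martifHeader> is excluded by the \b.
_ROOT_TAG = re.compile(r"<martif\b[^>]*>")
# The type attribute exactly as described in the thread: type="TBX".
_TYPE_ATTR = re.compile(r"""\btype=(["'])TBX\1""")


def patch_tbx(text: str, drop_lang_prefix: str = "zh") -> str:
    """Relabel the root element as TBX-Basic and drop langSets for one language."""

    # Step 1: change type="TBX" to type="TBX-Basic" on the root element only.
    def relabel(match):
        return _TYPE_ATTR.sub(r"type=\1TBX-Basic\1", match.group(0), count=1)

    text = _ROOT_TAG.sub(relabel, text, count=1)

    # Step 2: remove each <langSet>...</langSet> whose xml:lang starts with the
    # prefix (ASSUMPTION: Chinese is tagged zh, zh-CN, ZH-TW or similar).
    # langSet elements do not nest, so a non-greedy match up to the first
    # closing tag covers exactly one language section.
    lang_set = re.compile(
        r"""[ \t]*<langSet\b[^>]*\bxml:lang=(["'])"""
        + re.escape(drop_lang_prefix)
        + r"""[^"']*\1[^>]*>.*?</langSet>[ \t]*\r?\n?""",
        re.DOTALL | re.IGNORECASE,
    )
    return lang_set.sub("", text)
```

To use it, read the exported file as text with the encoding stated in its XML declaration, pass the text through `patch_tbx`, and write the result to a new file with the same encoding, keeping the original export as a backup.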

    The Lingo termbase editor displays the imported terms in a table:

    Screenshot of the Lingo termbase editor displaying a table with multilingual terms in German, English, French, Spanish, Japanese, and Portuguese. Japanese characters are visible but Chinese characters are not included.

    Below that is a form with fields for the other information associated at concept level and term level (images, part of speech info, etc.). All that seems to be blank. I'm not sure if that means there's some syntax issue in the TBX XML such that Lingo can't interpret these fields, or maybe the Lingo import function simply looks for concepts, languages, and terms, and ignores everything else. I'll need to dig a bit more.
