Multiterm Converter 2021 Parsing Error

I am trying to convert a .tbx file to create a termbase. 

The .tbx file is from the Irish terminological database, available here: https://www.tearma.ie/ioslodail/

I am getting this error:
SDL MultiTerm Convert dialog box showing file paths for input, output, termbase definition, and log files. An error message reads 'The conversion option could not be initialised properly. An error occurred while parsing EntityName. Line 3167, position 58.'


I can see other parsing errors on the forum, but nothing that helps with MultiTerm Convert 2021.

Can anybody advise?

TIA, Darán



  • I downloaded the TBX from the website.  As noted the file does contain invalid XML.  So this is what I did:

    1. Ran a couple of search & replace operations on the file to remove all the invalid XML and replace it with &amp;, which is the correct way to represent the reserved & character in XML.  As far as I can see there are no other reserved characters being used incorrectly.
    2. I then converted the TBX (it's not OLIF) using the Glossary Converter rather than MultiTerm Convert because it's easier ;-)  There were a few entries in Latin (LA), Maori (MA), and Old Irish (SGA), none of which are supported in Trados, so I ignored these.
    3. The entire TBX contains a large number of languages, which makes it much too large for MultiTerm to convert and handle.  So as a proof that it works I ignored all languages apart from English and Irish, and this works.  It just takes so long to reorganise that I lost patience and didn't wait till the end, but I did get a good number of entries in, enough to prove the concept:
      Trados Studio interface showing the TermBase Management window with a list of languages including English and Irish.
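
The search & replace in step 1 could be scripted roughly as below. This is only a sketch, not the exact operations that were run: it assumes the only problem is a bare & that does not begin one of the five predefined XML entities or a numeric character reference, and the function names and file paths are made up for illustration. If the file declares its own entities in a DTD, or uses CDATA sections, the pattern would need adjusting.

```python
import re

# A "&" that does NOT start a predefined XML entity (&amp; &lt; &gt; &apos; &quot;)
# or a numeric character reference (&#38; / &#x26;). Assumption: nothing else is valid here.
BARE_AMP = re.compile(r"&(?!(?:amp|lt|gt|apos|quot);|#[0-9]+;|#x[0-9A-Fa-f]+;)")


def escape_bare_ampersands(text: str) -> str:
    """Replace every bare '&' with '&amp;' and leave valid references untouched."""
    return BARE_AMP.sub("&amp;", text)


def fix_file(src_path: str, dst_path: str, encoding: str = "utf-8") -> None:
    """Write a repaired copy of src_path to dst_path, one line at a time."""
    # newline="" keeps the original line endings as they are.
    with open(src_path, encoding=encoding, newline="") as src, \
         open(dst_path, "w", encoding=encoding, newline="") as dst:
        for line in src:
            dst.write(escape_bare_ampersands(line))
```

Something like `fix_file("tearma.tbx", "tearma-fixed.tbx")` (hypothetical file names) would then give a copy to feed to the converter, leaving the download untouched.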

    Reality is that aside from the XML being invalid it's simply too large!
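
For the size problem, one option is to cut the TBX down to the wanted languages before converting. The sketch below is an assumption-heavy illustration rather than what was done above (the Glossary Converter was simply told to ignore the other languages). It assumes each language section is a `langSet` element (or `langSec` in newer TBX dialects) carrying an `xml:lang` attribute, and that English and Irish are tagged `en` and `ga`; `keep_languages` is a made-up name. It also parses the whole document in memory, so a really big file may need splitting first, and ElementTree drops the DOCTYPE and XML declaration on output.

```python
import xml.etree.ElementTree as ET

# Fully qualified name that ElementTree uses for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def keep_languages(tbx_text: str, keep=("en", "ga")) -> str:
    """Return the TBX with every language section not listed in `keep` removed."""
    wanted = {code.lower() for code in keep}
    root = ET.fromstring(tbx_text)
    # Snapshot the elements first so removing children is safe while looping.
    for parent in list(root.iter()):
        for child in list(parent):
            local_name = child.tag.rsplit("}", 1)[-1]  # strip any namespace
            if local_name not in ("langSet", "langSec"):
                continue
            lang = (child.get(XML_LANG) or "").lower()
            # Compare on the primary subtag, so "en-GB" counts as "en".
            if lang.split("-")[0] not in wanted:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")
```

The invalid & characters have to be fixed before this step, because the parser rejects the file otherwise.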

    https://multifarious.filkin.com/2019/06/29/how-do-you-eat-an-elephant/

    Also noting... "concepts" and "entries" are not the same thing ;-)

    Paul Filkin | RWS Group

