EntryID during conversion from TBX to XML + XDT

Hello,

I'm currently converting a TBX export from an "external" terminology database into the MultiTerm XML format so that I can import the entries into a MultiTerm database. I've tried "MultiTerm 2022 SR1 Convert" (17.1.2185) as well as the Glossary Converter (6.2.8543.33526).

Unfortunately, I run into the same problem with both applications, and I don't understand why they behave this way.
During the conversion, the value of the id attribute in "<termEntry id="XYZ">" in the TBX file is written to "<descripGrp><descrip type="conceptId">XYZ</descrip></descripGrp>" in the XML. In addition, each term entry in the XML file begins with "<conceptGrp><concept>{ascending number}</concept>".

The TBX file starts with:
"<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE martif SYSTEM "TBXcoreStructV02.dtd">
<martif type="TBX" xml:lang="de-DE">
  <martifHeader>
    <fileDesc>
      <titleStmt>
        <title>Title</title>
      </titleStmt>
      <sourceDesc>
        <p></p>
      </sourceDesc>
    </fileDesc>
    <encodingDesc>
      <p type="XCSURI">www.ttt.org/.../</p>
    </encodingDesc>
    <revisionDesc>
      <change>
        <p>2023-07-11 07:35:07 UTC</p>
      </change>
    </revisionDesc>
  </martifHeader>
  <text>
    <body>
      <termEntry id="10001">
      ..."

In my opinion, the "concept" tag should contain the value from "<termEntry id=" in the TBX instead of a "self-assigned" number. Otherwise, this might lead to duplicated entries in the MultiTerm database.

My first question is: why is the value of "<termEntry id=" in the TBX written to a "descripGrp" group instead of to the "concept" tag of the XML?

And second: what do I have to do to convert the TBX file into a MultiTerm XML file in which the term ID from the TBX is used as the term ID in the MultiTerm XML?
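
To make the second question concrete, here is a minimal sketch of one possible post-processing step on the converted XML: it copies the entry-level "conceptId" value into the "concept" element. It assumes the structure described above (a "conceptGrp" with a "concept" child and a "descripGrp/descrip" of type "conceptId"); the "mtf" root element and the function name copy_concept_ids are assumptions made for illustration. Whether MultiTerm actually keeps that number on import is not confirmed here.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; the <mtf> root element is an assumption.
SAMPLE = """<mtf>
  <conceptGrp>
    <concept>1</concept>
    <descripGrp><descrip type="conceptId">10001</descrip></descripGrp>
  </conceptGrp>
  <conceptGrp>
    <concept>2</concept>
    <descripGrp><descrip type="conceptId">10002</descrip></descripGrp>
  </conceptGrp>
</mtf>"""


def copy_concept_ids(xml_text):
    """Overwrite each <concept> number with the entry-level conceptId value."""
    root = ET.fromstring(xml_text)
    for grp in root.iter("conceptGrp"):
        concept = grp.find("concept")
        # Only look at descripGrp elements that are direct children of the entry.
        descrip = grp.find("descripGrp/descrip[@type='conceptId']")
        if concept is not None and descrip is not None and descrip.text:
            concept.text = descrip.text.strip()
    return ET.tostring(root, encoding="unicode")


result = copy_concept_ids(SAMPLE)
print(result)
```

Entries without a "conceptId" field are left unchanged, so the script does not invent IDs for them.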

Thank you very much in advance for any hint and your support. :)
Your help is very much appreciated.

Kind regards and have a nice and easy day.
Nils

  •  

    It's very hard for me to understand your problem without a better sample file (I don't know the product well enough).  So as an example I completed your file with a couple of terms:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE martif SYSTEM "TBXcoreStructV02.dtd">
    <martif type="TBX" xml:lang="de-DE">
      <martifHeader>
        <fileDesc>
          <titleStmt>
            <title>Title</title>
          </titleStmt>
          <sourceDesc>
            <p>Source Description</p>
          </sourceDesc>
        </fileDesc>
        <encodingDesc>
          <p type="XCSURI">www.ttt.org/.../</p>
        </encodingDesc>
        <revisionDesc>
          <change>
            <p>2023-07-11 07:35:07 UTC</p>
          </change>
        </revisionDesc>
      </martifHeader>
      <text>
        <body>
          <termEntry id="10001">
            <langSet xml:lang="de-DE">
              <tig>
                <term>Auto</term>
              </tig>
            </langSet>
            <langSet xml:lang="en-GB">
              <tig>
                <term>Car</term>
              </tig>
            </langSet>
          </termEntry>
          <termEntry id="10002">
            <langSet xml:lang="de-DE">
              <tig>
                <term>Fahrrad</term>
              </tig>
            </langSet>
            <langSet xml:lang="en-GB">
              <tig>
                <term>Bicycle</term>
              </tig>
            </langSet>
          </termEntry>
        </body>
      </text>
    </martif>
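
A quick way to check which IDs and terms such a file really contains, independent of any converter, is to read it with a small script. This is only a sketch: it uses Python's standard xml.etree.ElementTree on a shortened copy of the sample above (header omitted), and the function name list_entries is made up for illustration.

```python
import xml.etree.ElementTree as ET

# xml:lang lives in the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Shortened copy of the sample file (martifHeader omitted).
TBX_SAMPLE = """<martif type="TBX" xml:lang="de-DE">
  <text>
    <body>
      <termEntry id="10001">
        <langSet xml:lang="de-DE"><tig><term>Auto</term></tig></langSet>
        <langSet xml:lang="en-GB"><tig><term>Car</term></tig></langSet>
      </termEntry>
      <termEntry id="10002">
        <langSet xml:lang="de-DE"><tig><term>Fahrrad</term></tig></langSet>
        <langSet xml:lang="en-GB"><tig><term>Bicycle</term></tig></langSet>
      </termEntry>
    </body>
  </text>
</martif>"""


def list_entries(tbx_text):
    """Map each termEntry id to its terms, grouped by language code."""
    root = ET.fromstring(tbx_text)
    entries = {}
    for entry in root.iter("termEntry"):
        terms = {}
        for lang_set in entry.findall("langSet"):
            lang = lang_set.get(XML_LANG)
            terms[lang] = [term.text for term in lang_set.iter("term")]
        entries[entry.get("id")] = terms
    return entries


entries = list_entries(TBX_SAMPLE)
for entry_id, terms in entries.items():
    print(entry_id, terms)
```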
    

    Then I tested with the Glossary Converter and also with MultiTerm Convert.  I don't think either of them can "force" MultiTerm to use the TBX conceptid as its own entry ID, as I think MultiTerm has its own mechanism for this.  But at least the Glossary Converter can read the conceptid and add it as a new field at the entry level:

    Screenshot showing how the TBX converts properly inside MultiTerm with the ConceptID of the TBX being used as a conceptid entry ID.

    MultiTerm Convert doesn't see this field at all.

    Is this what you are trying to achieve?  To have MultiTerm use the same conceptid as the TBX?

    I think in this case the TradosAI reply looks pretty good!

    Paul Filkin | RWS Group

  • Hello,

    thank you very much for your quick response.

    Yes, I'm trying to get MultiTerm to use the conceptId as the term ID ("Entry Id"). So yes, the AI answer looks pretty good.

    Let me try to clarify my problem a little:
    My problem is that some terms are duplicated and (therefore) end up with different properties/values when the MultiTerm database is updated with the XML.

    One entry is:

    Trados Studio MultiTerm entry showing Entry Id 3640 with conceptId 13876 highlighted, indicating a potential duplicate issue.

    However, after importing an updated TBX/XML, the same term also exists with a different "Entry ID", although in the originating TBX file it is the same term according to the conceptId:

    Trados Studio MultiTerm entry showing a different Entry Id 3657 with the same conceptId 13876 highlighted, suggesting a duplication error.

    I'd expected that a term entry from the TBX file would always be referenced as one and the same term entry in the MultiTerm database, not as possibly multiple entries in the XML/MultiTerm.
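
The kind of duplication shown in the two screenshots could be searched for with a small script. This is a rough sketch, assuming the termbase content is available as MultiTerm XML with the "conceptGrp"/"concept"/"descrip type="conceptId"" structure from the original question; the "mtf" root element, the function name find_duplicate_concept_ids, and the third sample entry (3658 / 13877) are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Illustrative sample; the <mtf> root and the third entry are assumptions.
EXPORT_SAMPLE = """<mtf>
  <conceptGrp>
    <concept>3640</concept>
    <descripGrp><descrip type="conceptId">13876</descrip></descripGrp>
  </conceptGrp>
  <conceptGrp>
    <concept>3657</concept>
    <descripGrp><descrip type="conceptId">13876</descrip></descripGrp>
  </conceptGrp>
  <conceptGrp>
    <concept>3658</concept>
    <descripGrp><descrip type="conceptId">13877</descrip></descripGrp>
  </conceptGrp>
</mtf>"""


def find_duplicate_concept_ids(xml_text):
    """Return conceptId values that occur under more than one entry number."""
    root = ET.fromstring(xml_text)
    entry_numbers = defaultdict(list)
    for grp in root.iter("conceptGrp"):
        concept = grp.find("concept")
        descrip = grp.find("descripGrp/descrip[@type='conceptId']")
        if concept is None or descrip is None:
            continue  # skip entries without an entry-level conceptId
        if not concept.text or not descrip.text:
            continue
        entry_numbers[descrip.text.strip()].append(concept.text.strip())
    return {cid: nums for cid, nums in entry_numbers.items() if len(nums) > 1}


duplicates = find_duplicate_concept_ids(EXPORT_SAMPLE)
print(duplicates)
```

The result maps each repeated conceptId to the entry numbers that carry it, which gives a list of candidates to merge or delete by hand.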

    Does this clarify my problem a little more?

    Thank you very much for your support.

    Kind regards
    Nils

    [edited by: Trados AI at 2:21 PM (GMT 0) on 5 Mar 2024]