merging glossaries with synonyms and duplicate entries

Hi. 

I have two multilingual termbases (TB) with the same structure. Some entries (each with its own entry ID) contain synonyms. Some terms also occur more than once, as homographs in separate entries.

I would like to merge the two termbases, and this is where the problem arises. Let's say I want to import TB 2 (.xml file) into TB 1:

Case 1) Term is present in TB 2 but not in TB 1: the term is added to TB 1 with a new entry ID.

Case 2) Term is present in both TBs: information is merged. If the information for that term is exactly the same in TB 1 and TB 2, no changes occur, because no new info is available.

Case 3) Term is present in both TBs, but has multiple occurrences (homographs, separate entries) in TB 1. Chances are that the term is merged (matched) with the wrong homograph.

I need a way to control what happens during the merge. Ideally, when MultiTerm cannot decide on the correct match, it should ask me what to do, or skip the entry and flag it so that I can handle it manually at a later stage.

I imagine this is a common problem, yet I haven't found a solution.
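To make the three cases concrete, here is a minimal sketch of a pre-import check that could be run outside MultiTerm on the two XML exports. It is not a MultiTerm feature. It assumes the exports use the element names `conceptGrp`, `concept`, `languageGrp`, `language` (with a `lang` attribute) and `term`, which should be verified against the actual files. The function names `index_terms` and `classify` and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def index_terms(xml_text, lang="EN"):
    """Map each term in one language to the set of entry IDs that contain it."""
    index = defaultdict(set)
    root = ET.fromstring(xml_text)
    # Assumed layout: one <conceptGrp> per entry, with the entry ID in <concept>.
    for concept_grp in root.iter("conceptGrp"):
        entry_id = concept_grp.findtext("concept")
        for lang_grp in concept_grp.iter("languageGrp"):
            language = lang_grp.find("language")
            if language is None or language.get("lang") != lang:
                continue
            # Synonyms are simply several <term> elements in the same entry.
            for term in lang_grp.iter("term"):
                if term.text and term.text.strip():
                    index[term.text.strip()].add(entry_id)
    return index


def classify(tb1_xml, tb2_xml, lang="EN"):
    """Sort the terms of TB 2 into the three cases described above."""
    tb1 = index_terms(tb1_xml, lang)
    tb2 = index_terms(tb2_xml, lang)
    report = {"new": [], "unique_match": [], "ambiguous": []}
    for term in sorted(tb2):
        hits = tb1.get(term, set())
        if not hits:
            report["new"].append(term)           # Case 1: only in TB 2
        elif len(hits) == 1 and len(tb2[term]) == 1:
            report["unique_match"].append(term)  # Case 2: exactly one candidate
        else:
            report["ambiguous"].append(term)     # Case 3: homographs, review by hand
    return report


def entry(entry_id, *terms, lang="EN"):
    """Build one sample entry in the assumed export layout (illustration only)."""
    term_grps = "".join(f"<termGrp><term>{t}</term></termGrp>" for t in terms)
    return (f"<conceptGrp><concept>{entry_id}</concept><languageGrp>"
            f'<language lang="{lang}"/>{term_grps}</languageGrp></conceptGrp>')


# Hypothetical sample data: "bank" is a homograph in TB 1 (entries 1 and 2).
TB1_SAMPLE = "<mtf>" + entry(1, "bank", "shore") + entry(2, "bank") + entry(3, "river") + "</mtf>"
TB2_SAMPLE = "<mtf>" + entry(1, "bank") + entry(2, "river") + entry(3, "loan") + "</mtf>"

if __name__ == "__main__":
    print(classify(TB1_SAMPLE, TB2_SAMPLE))
```

The terms listed under `ambiguous` are the ones that would need a manual decision. The comparison is exact and per term, so an entry whose synonyms point to different TB 1 entries would need an entry-level check on top of this.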

Thanks for any help.

  • Probability score: 0.7  

    Question improvement suggestion: Provide specific examples of the terms and their structures in both termbases to clarify the merging process.  

    In the meantime, based on the information you have provided, here is a suggested answer to your question:  

    Hello!  

    Merging termbases with synonyms and duplicate entries can indeed be challenging, especially when dealing with homographs. Here’s a step-by-step approach to help you manage the merging process effectively:

    **Step 1: Prepare Your Termbases**  

    Before merging, ensure that both termbases (TB 1 and TB 2) are well-structured and that you have a clear understanding of the terms and their synonyms in each.

    **Step 2: Use MultiTerm for Importing**  

    Import TB 2 into TB 1 using MultiTerm Desktop. When setting up the import, you can choose how to handle entries with the same ID:

    - **Ignore**: Skip the entry if it already exists.

    - **Merge**: Combine the information from both entries.

    - **Overwrite**: Replace the existing entry with the new one.

    **Step 3: Handle Duplicate Terms**  

    To manage cases where terms are present in both TBs:

    - If the information is identical, no changes will occur.

    - If there are multiple occurrences (homographs), you may need to manually review these entries to ensure they are merged correctly.

    **Step 4: Manual Review for Homographs**  

    For terms that are homographs, consider implementing a manual review process:

    - If MultiTerm cannot determine the correct match, it should ideally prompt you to decide how to proceed. If this feature is not available, you may need to manually check these entries after the import.

    **Step 5: Utilize Duplicate Filters**  

    After merging, use the Duplicate Filters option in MultiTerm to identify and manage any remaining duplicates. This will help you clean up your termbase and ensure that each term is represented accurately.

    **Step 6: Finalize and Save**  

    Once you have reviewed and merged the entries, save your changes. It’s also a good practice to back up your termbases before making significant changes.

    By following these steps, you should be able to merge your termbases while maintaining control over the process. If you encounter specific issues during the merging, feel free to provide more details, and I’d be happy to assist further!  

    RWS Community AI
