Alignment of 2 translation memories with one common language

Hello,

Is there any way to align two different TMs that share one common language to get a third TM?

I have one TM with the language pair English > French and another with the language pair French > German. Both hold the same content, as the files were translated from English into French and then from French into German. I now need translations of similar documents from English into German, so I'd like to build an English > German TM from the content of the two existing TMs (English > French and French > German).

Does anyone know if that's possible anyhow?

Many thanks in advance.

  •  

    I hope this helps and keen to see if anyone has other ideas

    Indeed a smart solution.  But given the use of AI these days I thought I'd try a different approach for fun.  It didn't take very long and you may be interested.

     

    Does anyone know if that's possible anyhow?

    Here's a way using the concept of multilingual TMX files.  With the help of ChatGPT I created a Python script that takes the English to French TMX and the French to German TMX and merges them into a multilingual TMX with all three languages in there.  Then I can import that TMX into an English to German SDLTM and it will populate with English to German.

    Here's the script: 

    import xml.etree.ElementTree as ET
    
    def merge_tmx(eng_fr_file, fr_de_file, output_file):
        # Define the namespaces (if any other namespaces are used, add them here)
        namespaces = {
            'xml': 'http://www.w3.org/XML/1998/namespace',
        }
        
        # Register the namespace
        ET.register_namespace('xml', namespaces['xml'])
        
        # Parse the English-French TMX file
        tree_eng_fr = ET.parse(eng_fr_file)
        root_eng_fr = tree_eng_fr.getroot()
    
        # Parse the French-German TMX file
        tree_fr_de = ET.parse(fr_de_file)
        root_fr_de = tree_fr_de.getroot()
    
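        # Create a dictionary to hold French to German translations
        # (assumes the language codes are fr-FR and de-DE; adjust to match your TMX)
        fr_de_dict = {}
        for tu in root_fr_de.find('body'):
            french_tuv = tu.find("tuv[@xml:lang='fr-FR']", namespaces)
            german_tuv = tu.find("tuv[@xml:lang='de-DE']", namespaces)
            # Skip units that lack either language instead of failing on None
            if french_tuv is None or german_tuv is None:
                continue
            fr_de_dict[french_tuv.find('seg').text] = german_tuv.find('seg').text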
    
        # Iterate through the English-French TMX and add German where the French matches
        for tu in root_eng_fr.find('body'):
            french_tuv = tu.find("tuv[@xml:lang='fr-FR']", namespaces)
            # Skip units without a French variant instead of failing on None
            if french_tuv is None:
                continue
            french_seg = french_tuv.find('seg').text
            if french_seg in fr_de_dict:
                # If the French segment matches, add the corresponding German segment
                german_tuv = ET.Element('tuv')
                # xml:lang has to be set using its namespace-qualified name
                german_tuv.set(f"{{{namespaces['xml']}}}lang", 'de-DE')
                german_seg = ET.SubElement(german_tuv, 'seg')
                german_seg.text = fr_de_dict[french_seg]
                tu.append(german_tuv)
    
        # Write the merged TMX to a new file
        tree_eng_fr.write(output_file, encoding='utf-8', xml_declaration=True)
    
    if __name__ == "__main__":
        # Prompt for file names
        eng_fr_file = input("Enter the path of the English-French TMX file: ")
        fr_de_file = input("Enter the path of the French-German TMX file: ")
        output_file = "en-GB_fr-FR_de-DE.tmx"

        merge_tmx(eng_fr_file, fr_de_file, output_file)
        print(f"Multilingual TMX file created: {output_file}")
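
To see what the merge produces without touching real files, here is a minimal sketch that applies the same pivot idea to two tiny in-memory TMX fragments. The sample segments and the helper names `seg_text` and `pivot_merge` are made up for illustration and are not part of the script above; it assumes the same en-GB, fr-FR and de-DE language codes.

```python
import xml.etree.ElementTree as ET

# Fully qualified name of the xml:lang attribute as ElementTree stores it
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Made-up sample data: only the first French segment exists in both files.
EN_FR = """<tmx version="1.4"><header/><body>
  <tu>
    <tuv xml:lang="en-GB"><seg>Open the file.</seg></tuv>
    <tuv xml:lang="fr-FR"><seg>Ouvrez le fichier.</seg></tuv>
  </tu>
  <tu>
    <tuv xml:lang="en-GB"><seg>Close the window.</seg></tuv>
    <tuv xml:lang="fr-FR"><seg>Fermez la fenêtre.</seg></tuv>
  </tu>
</body></tmx>"""

FR_DE = """<tmx version="1.4"><header/><body>
  <tu>
    <tuv xml:lang="fr-FR"><seg>Ouvrez le fichier.</seg></tuv>
    <tuv xml:lang="de-DE"><seg>Öffnen Sie die Datei.</seg></tuv>
  </tu>
</body></tmx>"""


def seg_text(tu, lang):
    """Return the <seg> text of the variant in `lang`, or None if absent."""
    for tuv in tu.findall("tuv"):
        if tuv.get(XML_LANG) == lang:
            seg = tuv.find("seg")
            return seg.text if seg is not None else None
    return None


def pivot_merge(en_fr_root, fr_de_root, pivot="fr-FR", target="de-DE"):
    """Append a target-language <tuv> to each unit whose pivot text matches."""
    # Build the pivot -> target lookup from the second file
    lookup = {}
    for tu in fr_de_root.iter("tu"):
        key, value = seg_text(tu, pivot), seg_text(tu, target)
        if key is not None and value is not None:
            lookup[key] = value

    # Extend matching units in the first file and count them
    matched = 0
    for tu in en_fr_root.iter("tu"):
        key = seg_text(tu, pivot)
        if key in lookup:
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: target})
            ET.SubElement(tuv, "seg").text = lookup[key]
            matched += 1
    return matched


en_fr_root = ET.fromstring(EN_FR)
matched = pivot_merge(en_fr_root, ET.fromstring(FR_DE))
units = list(en_fr_root.iter("tu"))
print(f"{matched} of {len(units)} units received a German segment")
for tu in units:
    print(seg_text(tu, "en-GB"), "->", seg_text(tu, "de-DE"))
```

Units whose French segment has no exact counterpart in the French to German file stay bilingual, so the match count gives a rough idea of how much of the English to German TM will actually be populated.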

    And here's a video explaining how to use it:

    Paul Filkin | RWS

    Design your own training!
    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Many thanks for your reply. I gave it a first try, but the export to Excel took a very long time, as it's a large TM, and I had to cancel it because I could not continue working while the export was running. I might give it another try overnight.

    Thanks!

  • Many thanks for your reply as well. I'll definitely give it a try, thanks.

  •  

    I'll definitely give it a try

    You should... this way you will also retain the document structure for better context.  And it's fast.

    Paul Filkin | RWS

  • Thanks a lot for this. I tried it and it was indeed very fast. The aligned TM is quite large, but at first glance the results look quite good. Thanks for trying this new technology :)