Align Documents cannot align XLIFF 2.0 files

My client sent me two XLIFF files from Articulate: one with the source and one with the target. I need to create a TM from them for a new assignment.

I tried to align both files using the Align Documents function in Trados Studio, but I got an error saying that XLIFF 2.0 is not supported. Then I created SDLXLIFF files in a project and tried to align those, but again it says that XLIFF 2.0 is not supported.

My Trados version: Trados Studio 2022 SR2 - 17.2.9.18688

Please, let me know how to proceed. The files are confidential, so I can send only a sample.

Thank you!
Sotir


0 Paul over 1 year ago

Sotir Rangelov

A sample would be good.

Paul Filkin | RWS Group

________________________
Design your own training!
You've done the courses and still need to go a little further, or still not clear?
Tell us what you need in our Community Solutions Hub
0 Sotir Rangelov over 1 year ago in reply to Paul

Paul

Thank you for your attention!

I am attaching the sample files with 3 segments each:

https://we.tl/t-O3Dnays2VY

Meanwhile, as we needed to start the project on Friday, we did the following:
1. Created SDLXLIFF files from both the source (A) and the target (B).
2. Exported for external review as bilingual DOCX.
3. Copy-pasted the text from the B file into the target column of the A file.
4. Manually corrected the misalignments in the DOCX file.
5. Imported the bilingual A file back into Trados (it would not import until we removed all tags in the target column, and this was the pitfall of this process).
6. Updated a specifically created TM to use as a reference.

I will be happy to do the alignment in Trados next time. :)

So your help is highly appreciated!

Best regards,

Sotir

0 Paul over 1 year ago in reply to Sotir Rangelov

Sotir Rangelov

Thanks for the files. I don't know exactly why this won't work, so I will log this with support and we can create a bug as needed. In the meantime, and in case it helps with some ideas going forward, I was playing around with OpenAI this evening and created a Python script that will sort this out. This is what I did:

1. Opened the English source file in Studio as an en-bg project, copied source to target and saved the target file.
2. Ran the script, which asks first for the Bulgarian file (Bulgarian text in the source elements) and then for the en-bg XLIFF target file I created (which contains only English in source and target).
3. The script compares the unit IDs, and where they match it puts the Bulgarian text into the target of the en-bg file and saves the updated XLIFF as a new file.

The script is here in case you're interested:

from lxml import etree
import os

def pretty_print_element(elem, level=0):
    # Function to add indentation and newlines to an XML element, recursively for all its children
    i = "\n" + level*"  "
    if len(elem):
        if not elem.text or not elem.text.strip():
            elem.text = i + "  "
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
        for child in elem:
            pretty_print_element(child, level+1)
        if not child.tail or not child.tail.strip():
            child.tail = i
    else:
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = i

# User input for file paths
first_file_path = input('Enter the path to the first XLIFF file: ')
second_file_path = input('Enter the path to the second XLIFF file: ')
output_file_path = os.path.splitext(second_file_path)[0] + '_merged.xliff'

# Load the XML content of both files
first_tree = etree.parse(first_file_path)
second_tree = etree.parse(second_file_path)

# Define the XML namespace
ns = {'x': 'urn:oasis:names:tc:xliff:document:2.0'}

# Get the root of the XML files
first_root = first_tree.getroot()
second_root = second_tree.getroot()

# Iterate through each unit in the first file
for first_unit in first_root.xpath('//x:file/x:unit', namespaces=ns):
    unit_id = first_unit.get('id')
    # Find the corresponding unit in the second file
    second_unit = second_root.xpath(f'//x:file/x:unit[@id="{unit_id}"]', namespaces=ns)

    if second_unit:
        # Get the target node, or create one if it doesn't exist
        target_node = second_unit[0].xpath('.//x:segment/x:target', namespaces=ns)
        if not target_node:
            segment_node = second_unit[0].find('.//x:segment', ns)
            target_node = etree.SubElement(segment_node, f'{{{ns["x"]}}}target')
        else:
            target_node = target_node[0]
            # Remove any existing content in the target node
            target_node.clear()

        # Get the source node from the first unit
        source_node = first_unit.xpath('.//x:segment/x:source', namespaces=ns)[0]

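        # Copy the source text, then move the child elements into the target node
        # (lxml's append() moves an element rather than copying it; the first
        # tree is never written back to disk, so this is harmless here)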
        target_node.text = source_node.text
        for element in source_node:
            target_node.append(element)

# After updating the XML content but before writing it to a file
for element in second_tree.xpath('//x:unit/x:segment/x:target', namespaces=ns):
    pretty_print_element(element)

# Now write the updated and pretty-printed XML to a new file
second_tree.write(output_file_path, xml_declaration=True, encoding='UTF-8', pretty_print=True)

# Print a success message
print(f"The XLIFF files have been merged and saved as: {output_file_path}")

I ran it in the terminal of Visual Studio Code like this:

Screenshot showing Visual Studio Code and the running of the Python script.

The resulting file was this:

<?xml version='1.0' encoding='UTF-8'?>
<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" srcLang="en-GB" trgLang="bg-BG" version="2.0" xml:space="preserve">
  <file canResegment="no" id="Anti">
    <unit canResegment="no" id="6mC1hpNdo2N.Name" type="Articulate:PlainText">
      <segment>
        <source>Main Course</source>
        <target>Основен курс</target></segment>
    </unit>
    <unit canResegment="no" id="6T9xVpFJD5Y" type="Articulate:DocumentState">
      <originalData>
        <data id="generic_1">&lt;Style Justification="Center" /&gt;</data>
        <data id="span_2">&lt;Style FontSize="20.9454517" FontIsBold="False" /&gt;</data>
      </originalData>
      <segment>
        <source>
          <pc id="block_0">
            <ph dataRef="generic_1" id="generic_1"/>
            <pc dataRefStart="span_2" id="span_2">Create your personalised training today!</pc>
          </pc>
        </source>
        <target>
  <pc id="block_0">
    <ph dataRef="generic_1" id="generic_1"/>
    <pc dataRefStart="span_2" id="span_2">Създайте Вашето персонализирано обучение днес!</pc>
  </pc>
</target>
</segment>
    </unit>
    <unit canResegment="no" id="5eP0zufn63h" type="Articulate:DocumentState">
      <originalData>
        <data id="generic_1">&lt;Style /&gt;</data>
        <data id="span_2">&lt;Style FontFamily="Text TF Book" FontSize="10.4727259" FontIsBold="True" FontIsItalic="False" ForegroundColor="lt1,00" LinkColor="lt1,00" /&gt;</data>
      </originalData>
      <segment>
        <source>
          <pc id="block_0">
            <ph dataRef="generic_1" id="generic_1"/>
            <pc dataRefStart="span_2" id="span_2">START</pc>
          </pc>
        </source>
        <target>
  <pc id="block_0">
    <ph dataRef="generic_1" id="generic_1"/>
    <pc dataRefStart="span_2" id="span_2">НАЧАЛО</pc>
  </pc>
</target>
</segment>
    </unit>
  </file>
</xliff>

Which opens in Studio like this (tags fully expanded):

Screenshot of the final updated XLIFF in Studio

So now I can update into a TM.

Interestingly, after I gave up on PowerShell because I could not quite get it right, it took only about 10 minutes to come up with the code and create the file. So I think it is quite a neat solution for when things are not working as expected in Studio. If you have more files like this to do, it is probably a lot faster and more accurate too, since XLIFF maps neatly onto this sort of process and the IDs are checked.
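
As a rough illustration of the ID check mentioned above, here is a minimal sketch that lists unit IDs present in one XLIFF 2.0 file but not in the other, which can be worth running before a merge. It is not part of the script in this post: the names `unit_ids` and `compare_unit_ids` and the inline sample data are made up for the example, and it uses the standard library's xml.etree.ElementTree instead of lxml so that it runs without extra packages.

```python
import xml.etree.ElementTree as ET

# XLIFF 2.0 core namespace, as used in the files in this thread
NS = {"x": "urn:oasis:names:tc:xliff:document:2.0"}


def unit_ids(root):
    """Return the set of id attributes of every <unit> under the given root."""
    return {unit.get("id") for unit in root.iterfind(".//x:unit", NS)}


def compare_unit_ids(first_root, second_root):
    """Return (ids only in the first file, ids only in the second file)."""
    first_ids = unit_ids(first_root)
    second_ids = unit_ids(second_root)
    return sorted(first_ids - second_ids), sorted(second_ids - first_ids)


# Hypothetical sample data so the sketch runs on its own
SAMPLE_FIRST = """<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="bg-BG">
  <file id="f1">
    <unit id="u1"><segment><source>A</source></segment></unit>
    <unit id="u2"><segment><source>B</source></segment></unit>
  </file>
</xliff>"""

SAMPLE_SECOND = """<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en-GB" trgLang="bg-BG">
  <file id="f1">
    <unit id="u1"><segment><source>A</source><target>A</target></segment></unit>
    <unit id="u3"><segment><source>C</source><target>C</target></segment></unit>
  </file>
</xliff>"""

only_first, only_second = compare_unit_ids(
    ET.fromstring(SAMPLE_FIRST), ET.fromstring(SAMPLE_SECOND)
)
print("Only in first file:", only_first)
print("Only in second file:", only_second)
```

With real files, the roots would come from `ET.parse(path).getroot()` rather than the inline strings. Two empty lists mean every unit has a counterpart in the other file.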

Paul Filkin | RWS Group


0 Sotir Rangelov over 1 year ago in reply to Paul

Thank you, Paul!

This solution will certainly help other users until the Trados team looks into it.

I will try to test it later this week.
0 Sotir Rangelov over 1 year ago in reply to Paul

Well, I tried but the script stopped at the first file and I do not know why:

I installed Visual Studio Code, then Python 3, then lxml. I figured out that I need to put a double slash in the path. Maybe I need something else?

Generated Image Alt-Text
[edited by: Trados AI at 1:29 PM (GMT 0) on 29 Feb 2024]
0 Paul over 1 year ago in reply to Sotir Rangelov

Sotir Rangelov

I think you edited the script itself, and all that did was make your path part of the question the script prints. Just run it as I provided it and enter the paths when prompted in the terminal window. I would have recorded it last night, but it was late and I needed to be quiet where I was ;-) So just run it like this:
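
For illustration only, here is a minimal sketch of what happens to a path typed at the prompt. The `clean_path` helper is hypothetical and not part of the script above. It assumes the path was pasted with surrounding quotes, as some file managers produce when copying a path. Text entered through `input()` is taken literally, so backslashes in a Windows path do not need to be doubled there; doubling is only needed when a path is written as a string literal inside Python source code.

```python
def clean_path(raw):
    """Tidy a path typed or pasted at an input() prompt (hypothetical helper)."""
    # Drop leading and trailing whitespace first
    path = raw.strip()
    # Then remove one pair of matching quotes wrapped around the whole path
    if len(path) >= 2 and path[0] == path[-1] and path[0] in "\"'":
        path = path[1:-1]
    return path


# This literal stands in for what input() returns when the user pastes a quoted
# Windows path. The backslashes are doubled only because this is Python source;
# the text the user actually typed contains single backslashes.
typed = '"C:\\Work\\source_bg.xlf"'
print(clean_path(typed))
```

In the script, this would be applied to each answer before parsing, for example `first_file_path = clean_path(input(...))`.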

Paul Filkin | RWS Group

0 Sotir Rangelov over 1 year ago in reply to Paul

Got it! Thank you! The script does the magic. Thank you for your time!

Now I have another issue though.

The sample file opens just fine in Trados, but the real one gets this error:

Is there any way I can send you the original file privately, if you would like to take a look at it?

Generated Image Alt-Text
[edited by: Trados AI at 1:29 PM (GMT 0) on 29 Feb 2024]
0 Paul over 1 year ago in reply to Sotir Rangelov

Sotir Rangelov

You can send it to pfilkin at sdl dotcom and when I get a little time I can take a look and see if I can find the problem. I can't guarantee today or even this week... but I will take a look.

Paul Filkin | RWS Group

0 Sotir Rangelov over 1 year ago in reply to Paul

Whenever you have time for this is fine for me, Paul. Thank you and have a nice day!

+1 Paul over 1 year ago in reply to Sotir Rangelov

Sotir Rangelov

ok - here's a revised script:

from lxml import etree
import os

def pretty_print_element(elem, level=0):
    # Function to add indentation and newlines to an XML element, recursively for all its children
    i = "\n" + level*"  "
    if len(elem):
        if not elem.text or not elem.text.strip():
            elem.text = i + "  "
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
        for child in elem:
            pretty_print_element(child, level+1)
        if not child.tail or not child.tail.strip():
            child.tail = i
    else:
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = i

# User input for file paths
first_file_path = input('Enter the path to the first XLIFF file: ')
second_file_path = input('Enter the path to the second XLIFF file: ')
output_file_path = os.path.splitext(second_file_path)[0] + '_merged.xliff'

# Load the XML content of both files
first_tree = etree.parse(first_file_path)
second_tree = etree.parse(second_file_path)

# Define the XML namespace
ns = {'x': 'urn:oasis:names:tc:xliff:document:2.0'}

# Get the root of the XML files
first_root = first_tree.getroot()
second_root = second_tree.getroot()

# Iterate through each unit in the first file
for first_unit in first_root.xpath('//x:file/x:unit', namespaces=ns):
    unit_id = first_unit.get('id')
    # Find the corresponding unit in the second file
    second_unit = second_root.xpath(f'//x:file/x:unit[@id="{unit_id}"]', namespaces=ns)

    if second_unit:
        second_unit = second_unit[0]
        # Copy <originalData> section if it exists
        original_data = first_unit.find('.//x:originalData', ns)
        if original_data is not None:
            second_original_data = second_unit.find('.//x:originalData', ns)
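            if second_original_data is None:
                # If <originalData> does not exist in the second unit, create it.
                # XLIFF 2.0 expects <originalData> before the unit's <segment>
                # elements, so insert it there rather than appending it at the end.
                second_original_data = etree.Element(f'{{{ns["x"]}}}originalData')
                first_segment = second_unit.find('x:segment', ns)
                position = len(second_unit) if first_segment is None else second_unit.index(first_segment)
                second_unit.insert(position, second_original_data)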
            # Copy all <data> elements
            for data in original_data:
                if second_original_data.find(f'.//x:data[@id="{data.get("id")}"]', ns) is None:
                    # Only copy <data> if an element with the same id doesn't already exist
                    second_original_data.append(data)

        # Get the target node, or create one if it doesn't exist
        target_node = second_unit.xpath('.//x:segment/x:target', namespaces=ns)
        if not target_node:
            segment_node = second_unit.find('.//x:segment', ns)
            target_node = etree.SubElement(segment_node, f'{{{ns["x"]}}}target')
        else:
            target_node = target_node[0]
            # Remove any existing content in the target node
            target_node.clear()

        # Get the source node from the first unit
        source_node = first_unit.xpath('.//x:segment/x:source', namespaces=ns)[0]

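        # Copy the source text, then move the child elements into the target node
        # (lxml's append() moves an element rather than copying it; the first
        # tree is never written back to disk, so this is harmless here)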
        target_node.text = source_node.text
        for element in source_node:
            target_node.append(element)

# After updating the XML content but before writing it to a file
for element in second_tree.xpath('//x:unit/x:segment/x:target', namespaces=ns):
    pretty_print_element(element)

# Now write the updated and pretty-printed XML to a new file
second_tree.write(output_file_path, xml_declaration=True, encoding='UTF-8', pretty_print=True)

# Print a success message
print(f"The XLIFF files have been merged and saved as: {output_file_path}")

The problem was that the script did not handle the <originalData> section and its <data> elements correctly, which is how we ended up with that error. Now it seems to work fine.
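
One plausible way this goes wrong (an assumption, since the thread does not preserve the error text) is that a copied target contains inline tags whose `dataRef`, `dataRefStart` or `dataRefEnd` attribute points at a `<data>` id that the unit's `<originalData>` does not define. The sketch below looks for such dangling references in a merged file. The function name `missing_data_refs` and the sample XML are invented for the example, and it uses the standard library's xml.etree.ElementTree instead of lxml so that it runs without extra packages.

```python
import xml.etree.ElementTree as ET

# XLIFF 2.0 core namespace, as used in the files in this thread
NS = {"x": "urn:oasis:names:tc:xliff:document:2.0"}

# Inline-tag attributes that point at a <data> element in <originalData>
REF_ATTRS = ("dataRef", "dataRefStart", "dataRefEnd")


def missing_data_refs(root):
    """Return sorted (unit id, data id) pairs for references with no matching <data>."""
    problems = set()
    for unit in root.iterfind(".//x:unit", NS):
        # Collect the ids this unit actually defines
        defined = {d.get("id") for d in unit.iterfind("x:originalData/x:data", NS)}
        for target in unit.iterfind("x:segment/x:target", NS):
            for elem in target.iter():
                for attr in REF_ATTRS:
                    ref = elem.get(attr)
                    if ref is not None and ref not in defined:
                        problems.add((unit.get("id"), ref))
    return sorted(problems)


# Hypothetical sample: the second unit refers to span_2 but defines no <data>
SAMPLE = """<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en-GB" trgLang="bg-BG">
  <file id="f1">
    <unit id="ok">
      <originalData><data id="span_2">&lt;Style /&gt;</data></originalData>
      <segment>
        <source><pc id="span_2" dataRefStart="span_2">START</pc></source>
        <target><pc id="span_2" dataRefStart="span_2">START</pc></target>
      </segment>
    </unit>
    <unit id="broken">
      <segment>
        <source>START</source>
        <target><pc id="span_2" dataRefStart="span_2">START</pc></target>
      </segment>
    </unit>
  </file>
</xliff>"""

print(missing_data_refs(ET.fromstring(SAMPLE)))
```

An empty list means every reference in the targets resolves within its own unit. For a real merged file, pass `ET.parse(path).getroot()` instead of the inline sample.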

Paul Filkin | RWS Group


0 Paul over 1 year ago in reply to Paul

Sotir Rangelov

I also got a workaround for this bug from the Support team this afternoon... very easy workaround:

https://gateway.sdl.com/apex/communityknowledge?articleName=000021850

So you have two mechanisms to solve this now. Although I must admit I'm partial to the Python solution myself ;-)

Paul Filkin | RWS Group

0 Sotir Rangelov over 1 year ago in reply to Paul

Perfect result, thank you!
0 Sotir Rangelov over 1 year ago in reply to Paul

I will keep the workaround for the next project and try it then, too. Thank you!