XPath engine and XML parser of Trados 2015 SR 3

Hi, I have problems with XML and XPath in Trados 2015 SR3. My questions are at the end.

At the end of this post there is a sample XML file. To test it, copy the text, paste it into a text editor and save it as UTF-8, then follow these steps:

  1. Create a new Trados project.
  2. At the step 'Insert project files' press 'File types...'
  3. Press 'New...'
  4. Select 'XML (Embedded Content)', press ok.
  5. For file type identifier enter 'FastTranslator', press Next.
  6. Create XML based on default, press Next.
  7. Press 'Add'
  8. Rule type XPath, '//*' (without quotes), Not translatable, ok.
  9. Press 'Add'
  10. Rule type XPath, '//Text' (without quotes), Always translatable, ok.
  11. Next.
  12. Any root element, Add, 'FastTranslator' (without quotes), ok.
  13. Finish.
  14. Ok
  15. Press 'Add files' and load the test file.

Now carry out the six tests described in the comment of the file below. For each test, only the '//Text' expression of the rule from step 10 above changes. Trados honors some XPath expressions and ignores others, and it honors some that should fail.

  1. I would like the Trados XML/XPath engine to comply with the XML/XPath standards (any version). Is this planned? If so, when?
  2. Do I need to change some setting to be able to use standards-compliant XPath expressions?
  3. It is inconvenient that the XML parser exposes CDATA syntax to the XPath engine. Can this be fixed?
  4. Do you have a specification of the parts of XPath that you DO implement, so that I know what to expect?

Regards,
Jens Malmgren

Tools Manager at FastTranslator.com


 

<?xml version="1.0" encoding="UTF-8" ?>
<!--
1.  This XPath expression: //Text
    Should give this result:
    Element='<Text>Text One</Text>'
    Element='<Text>Text Two</Text>'
    Element='<Text>Text Three</Text>'
    Element='<Text>Text a</Text>'
    Element='<Text>Text b</Text>'
    Element='<Text>Text c</Text>'
    Element='<Text>Text Four</Text>'
    Element='<Text>Test Five</Text>'
    Element='<Text>Test Six</Text>'
    Conclusion: Both Trados 2015 SR3 and www.freeformatter.com/xpath-tester.htm give this result. It works!
   
2.  This expression: /FastTranslator/Item/Text
    Should give this result:
    Element='<Text>Text One</Text>'
    Element='<Text>Text Two</Text>'
    Element='<Text>Text Three</Text>'
    Element='<Text>Text a</Text>'
    Element='<Text>Text b</Text>'
    Element='<Text>Text c</Text>'
    Element='<Text>Text Four</Text>'
    Element='<Text>Test Five</Text>'
    Element='<Text>Test Six</Text>'
    Conclusion: Both Trados 2015 SR3 and www.freeformatter.com/xpath-tester.htm give this result. It works!
   
3.  This expression: //Item[Identifier/starts-with(text(),'Test')]/Text
    Should give this result:
    Element='<Text>Text One</Text>'
    Element='<Text>Text Two</Text>'
    Element='<Text>Text Three</Text>'
    Element='<Text>Text Four</Text>'
    Element='<Text>Test Five</Text>'
    Element='<Text>Test Six</Text>'
    Conclusion: Trados 2015 SR3 gives nothing, it failed. www.freeformatter.com/xpath-tester.htm gives this result, it works.

4.  This expression: //Item[Identifier = 'Test 2']/Text
    Should give this result:
    Element='<Text>Text Two</Text>'
    Conclusion: Both Trados 2015 SR3 and www.freeformatter.com/xpath-tester.htm give this result. It works!

5.  This expression: //Item[Identifier = 'Test 5']/Text
    Should give this result:
    Element='<Text>Test Five</Text>'
    Conclusion: Trados 2015 SR3 produced nothing, failed. www.freeformatter.com/xpath-tester.htm gives this result. It works!

6.  This expression: //Item[Identifier = '<![CDATA[Test 5]]>']/Text
    Should give this result:
    NO MATCH!
    Conclusion: Trados 2015 SR3 produced 'Test Five', which is incorrect; it failed. www.freeformatter.com/xpath-tester.htm gives NO MATCH. It works.
-->
<FastTranslator>
    <Item>
        <Identifier>Test 1</Identifier>
        <Text>Text One</Text>
    </Item>
    <Item>
        <Identifier>Test 2</Identifier>
        <Text>Text Two</Text>
    </Item>
    <Item>
        <Identifier>Test 3</Identifier>
        <Text>Text Three</Text>
    </Item>
    <Item>
        <Identifier>_Test a</Identifier>
        <Text>Text a</Text>
    </Item>
    <Item>
        <Identifier>_Test b</Identifier>
        <Text>Text b</Text>
    </Item>
    <Item>
        <Identifier>_Test c</Identifier>
        <Text>Text c</Text>
    </Item>
    <Item>
        <Identifier><![CDATA[Test 4]]></Identifier>
        <Text>Text Four</Text>
    </Item>
    <Item>
        <Identifier><![CDATA[Test 5]]></Identifier>
        <Text><![CDATA[Test Five]]></Text>
    </Item>
    <Item>
        <Identifier>Test 6</Identifier>
        <Text><![CDATA[Test Six]]></Text>
    </Item>
</FastTranslator>
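
As a rough cross-check of the expected results listed in the comment above, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree`. It is only a third reference point, not what Trados or the online tester runs. ElementTree implements just a subset of XPath, so the expressions are written relative to the root element (`.//` instead of `//`), and expressions 2 and 3 are left out because that subset supports neither absolute paths on an element nor functions such as `starts-with`. The names `SAMPLE`, `texts` and `results` are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Compact copy of the sample file (without the comment block).
SAMPLE = """<FastTranslator>
  <Item><Identifier>Test 1</Identifier><Text>Text One</Text></Item>
  <Item><Identifier>Test 2</Identifier><Text>Text Two</Text></Item>
  <Item><Identifier>Test 3</Identifier><Text>Text Three</Text></Item>
  <Item><Identifier>_Test a</Identifier><Text>Text a</Text></Item>
  <Item><Identifier>_Test b</Identifier><Text>Text b</Text></Item>
  <Item><Identifier>_Test c</Identifier><Text>Text c</Text></Item>
  <Item><Identifier><![CDATA[Test 4]]></Identifier><Text>Text Four</Text></Item>
  <Item><Identifier><![CDATA[Test 5]]></Identifier><Text><![CDATA[Test Five]]></Text></Item>
  <Item><Identifier>Test 6</Identifier><Text><![CDATA[Test Six]]></Text></Item>
</FastTranslator>"""

root = ET.fromstring(SAMPLE)


def texts(path):
    """Return the text of every element matched by an ElementTree path."""
    return [element.text for element in root.findall(path)]


# Keys are the test numbers used in the comment of the sample file.
results = {
    1: texts(".//Text"),
    4: texts(".//Item[Identifier='Test 2']/Text"),
    5: texts(".//Item[Identifier='Test 5']/Text"),
    6: texts(".//Item[Identifier='<![CDATA[Test 5]]>']/Text"),
}

for number, matched in results.items():
    print(number, matched)
```

With this parser the CDATA markers are gone by the time the expressions are evaluated, so test 5 matches 'Test Five' and test 6 matches nothing, in line with the expected results in the comment.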

  • Nice tests...

    Unknown said:
    3.  This expression: //Item[Identifier/starts-with(text(),'Test')]/Text

    Try this instead:

    //Item[Identifier[starts-with(text(),'Test')]]/Text

    I think this is a better expression. I'm also not sure yours is correct, even if it does work in your online tester.  I don't think it's right to use the predicate and "starts-with" after a forward slash... but I'm not expert enough to know for sure.

    Having said this Studio still doesn't give you the same result and I think it's because Studio sees the CDATA as a node rather than text.  I'm also not sure if this is intended or not.  Perhaps  can comment on that?  So Studio 2015 returns this:

    Text One
    Text Two
    Text Three
    Test Six

    Unknown said:
    5.  This expression: //Item[Identifier = 'Test 5']/Text

    Maybe try this as an alternative?

    //Item[contains(Identifier,'Test 5')]/Text

    Again, one for  to validate your expression.

    Unknown said:
    6.  This expression: //Item[Identifier = '<![CDATA[Test 5]]>']/Text

    Maybe related to the same issue as in 3 above.

    So not very conclusive on my part and definitely needs input from Patrik.

    Regards

    Paul

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub
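
The two alternative expressions suggested in this reply can be approximated outside Studio with a small Python sketch. ElementTree's XPath subset has no `starts-with` or `contains`, so plain string methods stand in for them here. That substitution, the shortened sample and the helper name `select` are assumptions made for illustration. The sketch only shows what a parser that merges CDATA into ordinary text would select; it says nothing about how Studio evaluates the expressions internally.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample: one plain item, one underscore item,
# and the two items whose <Identifier> is wrapped in CDATA.
SAMPLE = """<FastTranslator>
  <Item><Identifier>Test 1</Identifier><Text>Text One</Text></Item>
  <Item><Identifier>_Test a</Identifier><Text>Text a</Text></Item>
  <Item><Identifier><![CDATA[Test 4]]></Identifier><Text>Text Four</Text></Item>
  <Item><Identifier><![CDATA[Test 5]]></Identifier><Text><![CDATA[Test Five]]></Text></Item>
</FastTranslator>"""

root = ET.fromstring(SAMPLE)


def select(predicate):
    """Return the <Text> content of every <Item> whose Identifier passes predicate."""
    return [
        item.findtext("Text")
        for item in root.iter("Item")
        if predicate(item.findtext("Identifier", default=""))
    ]


# Stand-in for //Item[Identifier[starts-with(text(),'Test')]]/Text
starts = select(lambda identifier: identifier.startswith("Test"))

# Stand-in for //Item[contains(Identifier,'Test 5')]/Text
contains = select(lambda identifier: "Test 5" in identifier)

print(starts)
print(contains)
```

Here the items with a CDATA identifier ('Test 4' and 'Test 5') are selected alongside the plain one, which is where this differs from the Studio 2015 output quoted above.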

  • Unknown said:

    Try this instead:

    //Item[Identifier[starts-with(text(),'Test')]]/Text

    I think this is a better expression. I'm also not sure yours is correct, even if it does work in your online tester.  I don't think it's right to use the predicate and "starts-with" after a forward slash... but I'm not expert enough to know for sure.

    Having said this Studio still doesn't give you the same result and I think it's because Studio sees the CDATA as a node rather than text.  I'm also not sure if this is intended or not.

    Yep, that's correct... XPath Visualizer also sees this expression as correct (and the original one as incorrect).

    And yes, CDATA is a special section, not text, i.e. the returned value should be the complete content of the Text element, not content of the CDATA section inside the element.

    We would need to add "/text()" at the end of the XPath to retrieve the inner text of the CDATA section (i.e. also the inner text of the Text element).
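
The difference between selecting the Text element and selecting its inner text can be sketched with Python's standard-library ElementTree. Its XPath subset has no `text()` step, so the `.text` attribute stands in for it here. That is an assumption made for illustration, and Studio's own engine may behave differently.

```python
import xml.etree.ElementTree as ET

# One item from the sample, with both child elements wrapped in CDATA.
item = ET.fromstring(
    "<Item><Identifier><![CDATA[Test 5]]></Identifier>"
    "<Text><![CDATA[Test Five]]></Text></Item>"
)

text_element = item.find("Text")

# The element itself, serialized back to markup.
serialized = ET.tostring(text_element, encoding="unicode")

# The inner text only, roughly what a trailing /text() step asks for.
inner_text = text_element.text

print(serialized)
print(inner_text)
```

In this parser the CDATA wrapper does not survive parsing: the inner text is the plain string, and the serialized element no longer contains the CDATA markers.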

  • Unknown said:
    We would need to add "/text()" at the end of the XPath to retrieve the inner text of the CDATA section (i.e. also the inner text of the Text element)

    Thanks for that Evzen... very useful example of how to handle CDATA content.

    Paul Filkin | RWS Group


  • I will correct my XPath expressions. I found a Notepad++ plugin called XPatherizer that should work well and is also better at pointing out incorrect expressions.

    The suggestion to use text() to avoid the CDATA issue sounds good; I will try that.

    Thanks a million for all the valuable feedback!