XPATH in Studio 2017/2019: Issues getting substring-after and substring-before to work

Hi all,

The custom version of the Umbraco CMS we use at Amnesty features a somewhat strange "macro" system to deal with special text blocks such as pull-out quotes, embedded tweets, etc. Recently, our web team introduced a new kind of action button macro and I'm having some issues parsing it in Studio 2017 (I've also tried Studio 2019 with similar results).

The XML looks like this, with the bits I need to extract for translation highlighted in yellow:

<UMBRACO_MACRO macroAlias="Button" Url="[{&quot;id&quot;:66296,&quot;name&quot;:&quot;Read more&quot;,&quot;url&quot;:&quot;/umbraco/latest/campaigns/2020/03/covid-19/&quot;,&quot;icon&quot;:&quot;icon-rate&quot;}]" Heading="COVID-19 AND HUMAN RIGHTS" SubHeading="Stay Informed, Get Inspired, Take Action" YellowBackground="1" GreyBackground="0" BlackBorders="0" />

I've been able to parse the "COVID-19 AND HUMAN RIGHTS" heading and the "Stay Informed, Get Inspired, Take Action" easily using the following XPATH rules:

//UMBRACO_MACRO[@macroAlias="Button"]/@Heading

//UMBRACO_MACRO[@macroAlias="Button"]/@SubHeading

However, parsing the "Read more" bit is more challenging, as it's embedded in what looks like some sort of JSON-formatted value for the Url attribute. I can capture the entire contents of the Url attribute, but the result is not terribly pleasant to work with. Ideally, I'd like to grab just the text between the "name" and "url" key values (highlighted in blue) and, after playing around with XPATH Tester, I came up with an XPATH expression that should theoretically do just that:

substring-before(substring-after(//UMBRACO_MACRO[@macroAlias="Button"]/@Url, 'name'), 'url')

According to XPATH Tester, this should resolve to ":"Read more",". It's not perfect (I'd still like to get rid of those initial quotes and punctuation), but it's a start.
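For anyone who wants to check that result outside Studio, here is a minimal Python sketch of the same logic. It assumes the macro is wrapped in a made-up `<doc>` root element, and the helper names `substring_after` and `substring_before` are hypothetical stand-ins that mimic the XPath 1.0 functions; none of this is something Studio itself runs. The last lines show that parsing the attribute value as JSON would give the clean text without the surrounding quotes and punctuation.

```python
import json
import xml.etree.ElementTree as ET

# Sample macro from the post, wrapped in a hypothetical <doc> root element.
SAMPLE = '''<doc><UMBRACO_MACRO macroAlias="Button" Url="[{&quot;id&quot;:66296,&quot;name&quot;:&quot;Read more&quot;,&quot;url&quot;:&quot;/umbraco/latest/campaigns/2020/03/covid-19/&quot;,&quot;icon&quot;:&quot;icon-rate&quot;}]" Heading="COVID-19 AND HUMAN RIGHTS" SubHeading="Stay Informed, Get Inspired, Take Action" YellowBackground="1" GreyBackground="0" BlackBorders="0" /></doc>'''


def substring_after(text, marker):
    """Mimic XPath 1.0 substring-after(): text after the first marker, '' if absent."""
    if marker == "":
        return text
    _, found, tail = text.partition(marker)
    return tail if found else ""


def substring_before(text, marker):
    """Mimic XPath 1.0 substring-before(): text before the first marker, '' if absent."""
    if marker == "":
        return ""
    head, found, _ = text.partition(marker)
    return head if found else ""


root = ET.fromstring(SAMPLE)
# The XML parser has already turned &quot; back into plain double quotes here.
url_attr = root.find(".//UMBRACO_MACRO[@macroAlias='Button']").get("Url")

# Same nesting as the XPath expression above.
result = substring_before(substring_after(url_attr, "name"), "url")
print(result)

# Alternative: treat the attribute value as JSON and read the key directly.
name = json.loads(url_attr)[0]["name"]
print(name)
```

The string functions can only slice around the literal markers, which is why the quotes, colon and comma survive; reading the `name` key from the parsed JSON avoids that, but it needs a pre-processing step outside the XPath rule.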

However, for some reason, Studio does not seem to like this expression and doesn't pick up that text for that XPATH rule, despite the fact that I've marked it as "Always Translatable".

Am I doing anything obviously wrong here, or is it simply a matter of substring-before and substring-after not being supported in Studio XPATH? I remember seeing a thread that mentioned the comment() function was not supported as it belongs to the XPATH 2.0 standard, but substring-before and substring-after are part of XPATH 1.0 (section 4.2, String Functions), so they should be fine... right?

Any advice would be much appreciated — thanks in advance!

Fran

  • In the Studio help the use of XPath is explained as this:

    "SDL Trados Studio uses XPath to specify the applicable nodes."

    What it doesn't do is allow you to define what translatable text you would like to have extracted as part of an element or attribute rule.  To do this you'd have to use the embedded content processor but unfortunately that doesn't work on attribute values.  You can vote for this...

    https://community.sdl.com/ideas/translation-productivity-ideas/i/trados-studio-ideas/support-handling-of-embedded-content-in-xml-attributes

    So the only sensible solution I can think of would be to use something like the Cleanup Tasks app or, my preference, the SDL Data Protection Suite. You could then perhaps achieve something like this:

    Screenshot of Trados Studio showing XML code with elements and attributes related to COVID-19 and human rights campaign.

    I did this by applying the two rules you had already created, adding a third one to extract the Url attribute, and then tagging it retrospectively with the SDL Data Protection Suite.
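To make the idea of "extract the whole attribute, then tag everything except the translatable text" more concrete, here is a rough Python sketch. It is not the Data Protection Suite's configuration or API; the pattern `NAME_VALUE` and the function `split_segments` are hypothetical names used only for illustration, and the pattern assumes the `name` value contains no escaped quotes.

```python
import re

# Hypothetical pattern: the text between "name":" and the next double quote.
# Assumes the value itself contains no escaped quotes (\").
NAME_VALUE = re.compile(r'(?<="name":")[^"]*(?=")')


def split_segments(url_value):
    """Split the attribute value into ('protected' | 'translatable', text) pairs."""
    segments = []
    pos = 0
    for match in NAME_VALUE.finditer(url_value):
        if match.start() > pos:
            segments.append(("protected", url_value[pos:match.start()]))
        segments.append(("translatable", match.group()))
        pos = match.end()
    if pos < len(url_value):
        segments.append(("protected", url_value[pos:]))
    return segments


# Url attribute value from the original post, after XML unescaping.
url_value = (
    '[{"id":66296,"name":"Read more",'
    '"url":"/umbraco/latest/campaigns/2020/03/covid-19/","icon":"icon-rate"}]'
)

segments = split_segments(url_value)
for kind, text in segments:
    print(kind, text)
```

Joining the segments back together reproduces the original value unchanged, which is the property you would want from any tagging approach: only the "Read more" text is exposed for translation and the JSON around it stays intact.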

    Paul Filkin | RWS Group

    [edited by: Trados AI at 4:21 AM (GMT 0) on 5 Mar 2024]