Custom XML filetype: filter by attribute value (specific IDs) of parent element

Question

Hi Paul, I have a specific, urgent question. I got a survey XML from my client. I created a custom XML filetype for it to grab all translatable strings, but I also got a file with questions (IDs) to exclude. Is it possible to filter these via the filetype? The inline elements to be translated are

and and they are nested within several parent elements of which the main one is . So my question is: is it possible to set up a filter to say only show

and for translation if the number attribute is not one of the listed ones (e.g. (regex) (258|259|260|1130|2255))? Or maybe add a new element around them, so they won't be listed for translation (I would then have to remove these before returning the translation)? Best regards, Pascal

Paul · Answer

Pascal Zotto Always helps to have a sample to play with. But perhaps this will help. I created this sample and the expressions with the help of ChatGPT:

What is the name of the fire-breathing creature in Greek mythology?

Hint: It has the body of a lion and the tail of a serpent.

Is the Phoenix a creature that is reborn from its own ashes?

True or False?

Which creature in Norse mythology is known as the world serpent?

Hint: Its name starts with a 'J'.

What is the name of the one-eyed giants in Greek mythology?

Hint: It starts with 'C'.

Are Unicorns considered mythical creatures in every culture?

True or False?

What is the name of the multi-headed dog guarding the underworld in Greek mythology?

Hint: It starts with 'C'.

Which creature in Chinese mythology is known for its power over water?

Hint: It's a dragon.

Is the Kraken a legendary sea monster of gigantic size in Scandinavian folklore?

True or False?

What is the name of the bird in Egyptian mythology that symbolizes the sun, creation, and rebirth?

Hint: It starts with 'B'.

Is Bigfoot considered a mythical creature?

True or False?

Which creature in Japanese mythology is a turtle-like creature often depicted with a tail and long neck?

Hint: It starts with 'K'.

What is the name of the half-man, half-horse creatures in Greek mythology?

Hint: It starts with 'C'.

Is Medusa a mythical creature with snakes for hair in Greek mythology?

True or False?

What is the name of the legendary creature in Irish folklore known for its shape-shifting abilities?

Hint: It starts with 'C'.

Which creature in Hindu mythology is depicted as a large serpent that surrounds the world?

Hint: It starts with 'S'.

Are Goblins considered mythical creatures that are mischievous and troublemakers?

True or False?

What is the name of the half-man, half-goat creatures in Greek mythology?

Hint: It starts with 'S'.

Is the Loch Ness Monster a mythical creature believed to inhabit the waters of Loch Ness in Scotland?

True or False?

Which mythical creature is known for luring sailors to their doom with their enchanting voices in Greek mythology?

Hint: It starts with 'S'.

Is the Griffin a mythical creature with the body of a lion and the head of an eagle?

True or False?

What is the name of the mythical creature in Slavic folklore known for its ability to control the weather and bring storms?

Hint: It starts with 'B'.

Is the Manticore a mythical creature with the body of a lion, a human head, and a scorpion tail?

True or False?

The XPath expression to select the <p> and <computed> elements for translation, except for the ones whose enclosing <question> has a number attribute equal to 561, 763, 1066, or 1268, would be this, for example:

 //question[not(@number=561 or @number=763 or @number=1066 or @number=1268)]/content/*[self::p or self::computed]

//question : Selects all <question> elements in the XML document.
[not(@number=561 or @number=763 or @number=1066 or @number=1268)] : Keeps only the <question> elements whose number attribute is not equal to 561, 763, 1066, or 1268.
/content : Selects the <content> child element of the filtered <question> elements.
/*[self::p or self::computed] : Selects the child elements of the <content> elements that are either <p> or <computed>.

I don't think regex will work for your use case. But to try and make it easier to add lists of exclusions I pressed ChatGPT for a better answer:

 //question[not(contains('|561|763|1066|1268|', concat('|', @number, '|')))]/content/*[self::p or self::computed]

//question : Selects all <question> elements in the XML document.
not(contains('|561|763|1066|1268|', concat('|', @number, '|'))) :
 concat('|', @number, '|') : Concatenates the current number attribute value with pipe characters | on both sides.
 contains('|561|763|1066|1268|', ...) : Checks if the concatenated string is present in the list of exclusions (the pipe-separated string).
 not(...) : Keeps only the <question> elements that don't match the exclusions.
/content : Selects the <content> child element of the filtered <question> elements.
/*[self::p or self::computed] : Selects the child elements of the <content> elements that are either <p> or <computed>.

To add more exclusions, simply append them to the pipe-separated list, like |561|763|1066|1268|NewExclusion| .

That may not be exactly what you were after as you didn't provide a sample file, but perhaps it'll help you to handle the file you have? Both expressions worked well for me in Trados Studio with just two rules.
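For a long exclusion list, the contains()/concat() expression can be generated, not typed by hand. The following is a minimal sketch in Python, not part of the original answer: the function names build_exclusion_xpath and is_excluded are hypothetical, and the tail of the path is simply the one from the example above. The second function mirrors the XPath test in plain Python to show why the pipes matter: without them, an ID such as 56 would match inside 561.

```python
def build_exclusion_xpath(excluded_ids, tail="/content/*[self::p or self::computed]"):
    """Build the contains()/concat() XPath for a list of question numbers to exclude.

    Hypothetical helper: the element names and the tail come from the example
    in this thread and must be adapted to the real file.
    """
    # Wrap the list in pipes so every ID is delimited on both sides.
    id_list = "|" + "|".join(str(i) for i in excluded_ids) + "|"
    return (
        "//question[not(contains('" + id_list + "', "
        "concat('|', @number, '|')))]" + tail
    )


def is_excluded(number, excluded_ids):
    """Mirror the XPath test in plain Python: look for '|number|' in the list."""
    id_list = "|" + "|".join(str(i) for i in excluded_ids) + "|"
    return "|" + str(number) + "|" in id_list


if __name__ == "__main__":
    ids = [561, 763, 1066, 1268]
    print(build_exclusion_xpath(ids))
    print(is_excluded(561, ids))  # True
    print(is_excluded(56, ids))   # False: '|56|' does not occur in the list
```

The generated string can then be pasted into the parser rule. IDs read from a client's exclusion file could be passed in as a list in the same way.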

Paul · Answer

Pascal Zotto ok - using this file I did two things. First I used this rule to see how many segments would be extracted with no exclusions:

 //question/*[self::headline or self::choices]/text/p

This gave me 57 segments, including those with "not this one" in the <p> elements. Then I used this rule:

 //question[not(contains('|526|811|', concat('|', @number, '|')))]/*[self::headline or self::choices]/text/p

Now I get 44 segments, as 13 were excluded (3 in one place and 10 in another). So I think that solves it based on my understanding so far. Two rules:

Always translatable
 //question[not(contains('|526|811|', concat('|', @number, '|')))]/*[self::headline or self::choices]/text/p

Not translatable
 //*

Let me know if that works for you too. Using the contains() and concat() functions should make it simple to manage if you have a lot of exclusions.
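The before/after segment count can be sanity-checked outside Trados Studio. The sketch below uses only Python's standard library and a small made-up sample: the survey/section/question/headline/choices/text/p structure is an assumption inferred from the rules in this thread, not the client file. Because xml.etree.ElementTree does not support contains() or concat() in its XPath subset, the filter is re-implemented in plain Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real survey file was not posted in the thread.
SAMPLE = """<survey>
  <section>
    <question number="100">
      <headline><text><p>Keep this headline</p></text></headline>
      <choices><text><p>Keep choice A</p><p>Keep choice B</p></text></choices>
    </question>
    <question number="526">
      <headline><text><p>not this one</p></text></headline>
      <choices><text><p>not this one</p></text></choices>
    </question>
    <question number="5260">
      <headline><text><p>Keep: 5260 is not 526</p></text></headline>
    </question>
  </section>
</survey>"""


def translatable_paragraphs(root, excluded=()):
    """Collect the text of headline/choices -> text -> p, skipping excluded questions."""
    excluded = {str(n) for n in excluded}
    result = []
    for question in root.iter("question"):
        # Exact comparison, equivalent to the pipe-delimited contains() test.
        if question.get("number") in excluded:
            continue
        for child in question:
            if child.tag in ("headline", "choices"):
                result.extend(p.text for p in child.findall("text/p"))
    return result


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(len(translatable_paragraphs(root)))              # 6
    print(len(translatable_paragraphs(root, [526, 811])))  # 4
```

Question 5260 stays translatable even though its number starts with 526, which is the behaviour the pipe delimiters give the XPath rule.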

Paul · Answer

Pascal Zotto 
 I doubt that! Here's a video that might help you to see what you are doing that I didn't... or vice versa.

Paul · Answer

Pascal Zotto 
 Then you have three ways to tackle it. 
 Way #1 
 Add the structure you assigned to your parser rule (you probably have to do that as we didn't use it yet). 
 
 Then add the structure here in 1. 
 
 Finally configure the rules to apply to the structure: 
 
 Way #2 
 Use the plain text processor for the embedded content instead: 
 
 Then create your rules in the plain text embedded processor. 
 Way #3 
 Use the Multilingual XML filetype and select the "Treat as Monolingual" option: 
 
 Now you can use the html embedded content processor AND add regex rules for your variables:

Paul · Answer

Pascal Zotto 
 Pascal Zotto said: is it even possible to match more than one of the same subelement with Multilingual XML parser as to me the rule which defines which language is to be found in which subelement only "allows" one element per language. 
 It depends. You can of course match more than one element; it all depends on your rule. It's not intended to be a complete replacement for the XML filetype, as that sort of flexibility around extracting whatever you like with a parser rule obviously won't work, but for simple structures like yours it seems possible. For example... using this as the "Languages Root": 
 /survey/section/question[not(contains('|526|811|', concat('|', @number, '|')))]/*[self::headline or self::choices]/text 
 And this as the language: 
 p 
 Seems to work... I added a couple of placeholders and used the embedded html for the ones you added: 
 
 I didn't spend a lot of time on this so you'd need to check thoroughly if anything was missing, but it seems possible with this quick check.