Custom XML filetype: filter by attribute value (specific IDs) of parent element

Hi ,

I have a specific, urgent question. I received a survey XML from my client. I created a custom XML filetype for it to grab all translatable strings, but I also received a file listing questions (IDs) to exclude. Is it possible to filter these out via the filetype?

The inline elements to be translated are <p> and <computed> and they are nested within several parent elements of which the main one is <question number="x" type="xy">.

So my question is: is it possible to set up a filter that only shows <p> and <computed> for translation if the number attribute of <question …> is not one of the listed ones (e.g. (regex) (258|259|260|1130|2255))?

Or maybe add a new element <exclude> around them, so they won't be listed for translation (I would then have to remove these before returning the translation)?

Best regards,

Pascal



better formulation of question while using correct terminology
[edited by: Pascal Zotto at 1:49 PM (GMT 1) on 15 May 2023]
  •  

    Always helps to have a sample to play with.  But perhaps this will help.  I created this sample and the expressions with the help of ChatGPT:

    <quiz>
        <question number="258" type="MCQ">
            <content>
                <p>What is the name of the fire-breathing creature in Greek mythology?</p>
                <computed>Hint: It has the body of a lion and the tail of a serpent.</computed>
            </content>
        </question>
        <question number="359" type="TF">
            <content>
                <p>Is the Phoenix a creature that is reborn from its own ashes?</p>
                <computed>True or False?</computed>
            </content>
        </question>
        <question number="460" type="MCQ">
            <content>
                <p>Which creature in Norse mythology is known as the world serpent?</p>
                <computed>Hint: Its name starts with a 'J'.</computed>
            </content>
        </question>
        <question number="561" type="MCQ">
            <content>
                <p>What is the name of the one-eyed giants in Greek mythology?</p>
                <computed>Hint: It starts with 'C'.</computed>
            </content>
        </question>
        <question number="662" type="TF">
            <content>
                <p>Are Unicorns considered mythical creatures in every culture?</p>
                <computed>True or False?</computed>
            </content>
        </question>
        <question number="763" type="MCQ">
            <content>
                <p>What is the name of the multi-headed dog guarding the underworld in Greek mythology?</p>
                <computed>Hint: It starts with 'C'.</computed>
            </content>
        </question>
        <question number="864" type="MCQ">
            <content>
                <p>Which creature in Chinese mythology is known for its power over water?</p>
                <computed>Hint: It's a dragon.</computed>
            </content>
        </question>
        <question number="965" type="TF">
            <content>
                <p>Is the Kraken a legendary sea monster of gigantic size in Scandinavian folklore?</p>
                <computed>True or False?</computed>
            </content>
        </question>
        <question number="1066" type="MCQ">
            <content>
                <p>What is the name of the bird in Egyptian mythology that symbolizes the sun, creation, and rebirth?</p>
                <computed>Hint: It starts with 'B'.</computed>
            </content>
        </question>
        <question number="1167" type="TF">
            <content>
                <p>Is Bigfoot considered a mythical creature?</p>
                <computed>True or False?</computed>
            </content>
        </question>
        <question number="1268" type="MCQ">
            <content>
                <p>Which creature in Japanese mythology is a turtle-like creature often depicted with a tail and long neck?</p>
                <computed>Hint: It starts with 'K'.</computed>
            </content>
        </question>
        <question number="1369" type="MCQ">
            <content>
                <p>What is the name of the half-man, half-horse creatures in Greek mythology?</p>
                <computed>Hint: It starts with 'C'.</computed>
            </content>
        </question>
        <question number="1470" type="TF">
            <content>
                <p>Is Medusa a mythical creature with snakes for hair in Greek mythology?</p>
                <computed>True or False?</computed>
            </content>
        </question>
        <question number="1571" type="MCQ">
            <content>
                <p>What is the name of the legendary creature in Irish folklore known for its shape-shifting abilities?</p>
                <computed>Hint: It starts with 'C'.</computed>
            </content>
        </question>
        <question number="1672" type="MCQ">
            <content>
                <p>Which creature in Hindu mythology is depicted as a large serpent that surrounds the world?</p>
                <computed>Hint: It starts with 'S'.</computed>
            </content>
        </question>
        <question number="1773" type="TF">
            <content>
                <p>Are Goblins considered mythical creatures that are mischievous and troublemakers?</p>
                <computed>True or False?</computed>
            </content>
        </question>
        <question number="1874" type="MCQ">
            <content>
                <p>What is the name of the half-man, half-goat creatures in Greek mythology?</p>
                <computed>Hint: It starts with 'S'.</computed>
            </content>
        </question>
        <question number="1975" type="TF">
            <content>
                <p>Is the Loch Ness Monster a mythical creature believed to inhabit the waters of Loch Ness in Scotland?</p>
                <computed>True or False?</computed>
            </content>
        </question>
        <question number="2076" type="MCQ">
            <content>
                <p>Which mythical creature is known for luring sailors to their doom with their enchanting voices in Greek mythology?</p>
                <computed>Hint: It starts with 'S'.</computed>
            </content>
        </question>
        <question number="2177" type="TF">
            <content>
                <p>Is the Griffin a mythical creature with the body of a lion and the head of an eagle?</p>
                <computed>True or False?</computed>
            </content>
        </question>
        <question number="2278" type="MCQ">
            <content>
                <p>What is the name of the mythical creature in Slavic folklore known for its ability to control the weather and bring storms?</p>
                <computed>Hint: It starts with 'B'.</computed>
            </content>
        </question>
        <question number="2255" type="TF">
            <content>
                <p>Is the Manticore a mythical creature with the body of a lion, a human head, and a scorpion tail?</p>
                <computed>True or False?</computed>
            </content>
        </question>
    </quiz>

    The XPath expression to select the <p> and <computed> elements for translation, except for the ones whose parent <question> has an attribute number equal to 561, 763, 1066, or 1268 would be this for example:

    //question[not(@number=561 or @number=763 or @number=1066 or @number=1268)]/content/*[self::p or self::computed]

    1. //question: It selects all <question> elements in the XML document.
    2. [not(@number=561 or @number=763 or @number=1066 or @number=1268)]: Filters the <question> elements that don't have a number attribute equal to 561, 763, 1066, or 1268.
    3. /content: Selects the <content> child element of the filtered <question> elements.
    4. /*[self::p or self::computed]: Selects the child elements of the <content> elements that are either <p> or <computed>.
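
To check the selection logic outside Trados Studio, here is a minimal Python sketch of the same filter. It is an emulation rather than an XPath evaluation: the standard-library xml.etree.ElementTree module only supports a limited XPath subset, so the predicate is written as plain Python. The shortened sample, the function name translatable_texts and the EXCLUDED set are illustrative assumptions, not part of any filetype setting.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical subset of the sample above.
SAMPLE = """
<quiz>
    <question number="258" type="MCQ">
        <content>
            <p>What is the name of the fire-breathing creature in Greek mythology?</p>
            <computed>Hint: It has the body of a lion and the tail of a serpent.</computed>
        </content>
    </question>
    <question number="561" type="MCQ">
        <content>
            <p>What is the name of the one-eyed giants in Greek mythology?</p>
            <computed>Hint: It starts with 'C'.</computed>
        </content>
    </question>
    <question number="359" type="TF">
        <content>
            <p>Is the Phoenix a creature that is reborn from its own ashes?</p>
            <computed>True or False?</computed>
        </content>
    </question>
</quiz>
"""

# Compared as strings here; the XPath expression compares @number numerically.
EXCLUDED = {"561", "763", "1066", "1268"}


def translatable_texts(xml_text, excluded):
    """Return the text of <p>/<computed> under non-excluded <question> elements."""
    root = ET.fromstring(xml_text)
    texts = []
    for question in root.iter("question"):            # //question
        if question.get("number") in excluded:        # [not(@number=...)]
            continue
        for content in question.findall("content"):   # /content
            for child in content:                     # /*[self::p or self::computed]
                if child.tag in ("p", "computed"):
                    texts.append(child.text)
    return texts


if __name__ == "__main__":
    for text in translatable_texts(SAMPLE, EXCLUDED):
        print(text)
```

Running it prints the strings for questions 258 and 359 only; question 561 is skipped because it is in the exclusion set.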

    I don't think regex will work for your use case.  But to try and make it easier to add lists of exclusions I pressed ChatGPT for a better answer:

    //question[not(contains('|561|763|1066|1268|', concat('|', @number, '|')))]/content/*[self::p or self::computed]

    1. //question: It selects all <question> elements in the XML document.
    2. not(contains('|561|763|1066|1268|', concat('|', @number, '|'))):
      • concat('|', @number, '|'): Concatenates the current number attribute value with pipe characters | on both sides.
      • contains('|561|763|1066|1268|', ...): Checks if the concatenated string is present in the list of exclusions (the pipe-separated string).
      • not(...): Filters the <question> elements that don't match the exclusions.
    3. /content: Selects the <content> child element of the filtered <question> elements.
    4. /*[self::p or self::computed]: Selects the child elements of the <content> elements that are either <p> or <computed>.

    To add more exclusions, simply append them to the pipe-separated list, like |561|763|1066|1268|NewExclusion|.
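
The pipes on both sides are what make this safe. The following Python sketch mirrors the contains()/concat() test with plain substring checks to show what goes wrong without the delimiters. The function names and the default list are hypothetical and only serve the illustration.

```python
def is_excluded(number, exclusion_list="|561|763|1066|1268|"):
    """Mirror of contains('|561|763|1066|1268|', concat('|', @number, '|'))."""
    return f"|{number}|" in exclusion_list


def is_excluded_naive(number, exclusion_list="561|763|1066|1268"):
    """Same idea without wrapping the number in pipes: plain substring test."""
    return number in exclusion_list


if __name__ == "__main__":
    for number in ("561", "56", "106", "2255"):
        print(number, is_excluded(number), is_excluded_naive(number))
```

With the delimiters, only the complete IDs match. Without them, a question numbered 56 or 106 would be excluded by accident because those digits occur inside 561 and 1066.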

    That may not be exactly what you were after as you didn't provide a sample file, but perhaps it'll help you to handle the file you have?  But both expressions worked well for me in Trados Studio with just two rules... for example;

    [Screenshot: Trados Studio parser rule configuration window with an XPath expression input to filter specific question elements for translation.]

    [Screenshot: Preview of an XML file in Trados Studio showing questions about mythical creatures with corresponding hints.]

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub



  • Hi ,

    yes, sorry, I was in a complete rush yesterday and forgot to add a sample and the filetype settings I have so far. I only thought of these 2 hours ago.

    This is my first try to build such a filetype setting from scratch.

    Apparently we used different approaches to create the filetype settings, as I don't get any results at all when I add the rules to the parser. :( I have Trados 2022. I tried 2 filetype settings but don't get the correct result with either...

    Not sure what I did wrong when using the wizard to create the settings.

    [Attachment: 2 xml file type settings.zip]

  •  

    I don't know about good... but I like to use this one:

    https://download.cnet.com/XPath-Visualizer/3000-7241_4-75804649.html

    I just found that link as I've had this tool for years and the original link which I shared in this article (https://multifarious.filkin.com/2015/11/07/x-files-ata56/ ) doesn't work anymore.  So take this at your own risk!

    If you do use it, note that you have to press "Alt" every time you start it to access the menus.  Catches everyone out!

    [Screenshot: XPath Visualizer Tool v1.3.0.6 with an XPath expression input field and an XML example displayed, showing various text elements and attributes.]

    Paul Filkin | RWS Group

  • Unfortunately XPath Visualizer does not work for me. It freezes as soon as I add some code or a file. :(

  •  

    Too bad... maybe take a look at some of the other tools I noted in that presentation.

    Paul Filkin | RWS Group

  •  

    Notepad++ has a plugin named XML Tools to get XPaths. Everything is free… not tested though.

  • doh... I have that extension installed but did not think about it for xpath. I looked under notepad++ plugins for xpath first thing last week and did not get any result. Today I tried again and even got XPatherizerNPP oO

    Thanks for prompting me to look at NP++ again.

    Both plugins seem to work fine but take different approaches to showing the results.

  •  

    I’m getting closer to getting it to work for Multilingual XML, but I just noticed a problem with the regex for a placeholder pattern. The regex is correct (tested in Notepad++ and RegexBuddy), but in the preview it does not match the expected string, only the beginning of it: <

    It should match <This string of whatever length>

    Regex used: ^((\s)?(&lt;|<).*(&gt;|>)?)$

    So instead of converting <This string of whatever length> to a tag it only converts <
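
As a cross-check in one more engine, here is a small sketch using Python's re module. It only shows how this pattern behaves in a standard regex engine and says nothing about how Trados applies placeholder patterns. The test strings are illustrative.

```python
import re

# The pattern exactly as posted above.
PATTERN = re.compile(r"^((\s)?(&lt;|<).*(&gt;|>)?)$")

full = PATTERN.match("<This string of whatever length>")
unclosed = PATTERN.match("<No closing bracket here")

if __name__ == "__main__":
    # The whole string matches, as in the other regex testers.
    print(full.group(0))
    # Group 4 is empty: the greedy .* already consumed the final '>'.
    print(full.group(4))
    # Because the closing group is optional, an unclosed string matches too.
    print(unclosed.group(0))
```

One thing this shows: since (&gt;|>)? is optional and .* is greedy, the pattern does not actually require a closing > at the end of the line.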

  •  

    You probably want this:

    ((\s)?(&lt;|<).*(&gt;|>)?)

    Although it doesn't seem too clever and might not be what you really need at all.

    it should match <This string of whatever length>

    Your regex is greedy, so could fail you by picking up too much.  I have no idea what your files look like but this might be more appropriate:

    (<[^>]*?>|&lt;[^&]*?&gt;)
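
To illustrate the greedy point, here is a short sketch with Python's re module comparing the two expressions on a made-up string containing several tags. The sample text is an assumption, and Trados may apply the patterns differently, so treat this as a demonstration of the regex behaviour only.

```python
import re

# Unanchored version of the original pattern (greedy .*).
GREEDY = re.compile(r"((\s)?(&lt;|<).*(&gt;|>)?)")
# Alternative that stops at the first closing bracket or entity.
BOUNDED = re.compile(r"(<[^>]*?>|&lt;[^&]*?&gt;)")

# Hypothetical segment with literal and escaped tags.
TEXT = "<b>Click</b> here or &lt;i&gt;there&lt;/i&gt;"

greedy_match = GREEDY.search(TEXT).group(0)
bounded_matches = BOUNDED.findall(TEXT)

if __name__ == "__main__":
    print(greedy_match)      # the greedy pattern swallows the whole segment
    print(bounded_matches)   # the bounded pattern finds each tag separately
```

The greedy expression treats the entire segment as one match, translatable words included, while the bounded one returns the four tags individually.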

    Paul Filkin | RWS Group

  • It's as greedy as it needs to be, as the rule says to match anything between < and >, but only if they are at the beginning and end of the segment/line respectively. ;) The problem is that it does not pick up enough, as stated above: it stops right after <.

    I’ll try your rule at home while making it greedier again. ;)

  • Your version does not solve the problem either. It still only matches the < at the beginning of the segment instead of getting everything up to the > at the end of the segment. The strange thing is that it works well in all regex testers I used (Notepad++, RegexBuddy,…). But once again, Trados does not like me with the rules ^^

  •   

    Perhaps you can provide an example string that you’re working with so I can enjoy the same tests as you?

    Paul Filkin | RWS Group
