How to use regular expressions in XML file types to mark placeholders?

I created a custom XML file type for an XML file with embedded HTML content and would like to mark formatting strings enclosed in square brackets as placeholders.

Here's an excerpt:

<content:encoded><![CDATA[<p>This is [B]an[/B] example.</p>]]></content:encoded>

What kind of XPath query do I need to select all strings enclosed in square brackets?

Reply
  • I think you are expecting too much from the Studio parser. There is an XML parser to handle the XML content and an HTML parser to handle the embedded content. If you have custom requirements, you might have to find a custom solution. One simple way might be to turn the [B]stuff[/B] into tags. If you pre-process your files to arrive at something like this:

        <content><![CDATA[<p>This is <custom_content value="[B]an[/B]"/> example.</p>]]></content>

    (I used the regex (\[B\].*?\[\/B\]) for matching and <custom_content value="$1"/> for replacing.)

    How easy that is depends a bit on the content: Is it always [B], or can it be [all kinds of things]? What is enclosed? If it contains characters like <, ", ' or &, the content might need escaping.

    If it is as simple as in your sample, the above will then show up in the editor as a tag.

    Daniel

    EDIT: I should add that if you go this route, you will have to post-process the files accordingly. Just to state the obvious.
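
The pre- and post-processing round trip described above could be sketched in Python roughly as follows. This is only a minimal sketch under a few assumptions: the function names `pre_process` and `post_process` are made up for illustration, only the [B]...[/B] pair from the sample is handled, and the translated file is assumed to still contain the element serialized as `<custom_content value="..."/>` (optionally with a space before `/>`). The attribute value is escaped so that characters like <, ", ' and & survive the trip.

```python
import html
import re

# Same idea as the regex in the reply: one non-greedy [B]...[/B] run.
# Only the [B] tag is covered here; other bracket tags would need a broader pattern.
BRACKET_PAIR = re.compile(r"(\[B\].*?\[/B\])", re.DOTALL)

# The placeholder element written by pre_process (assumed serialization).
PLACEHOLDER = re.compile(r'<custom_content\s+value="([^"]*)"\s*/>')


def pre_process(text: str) -> str:
    """Wrap each [B]...[/B] run in a <custom_content/> element before import."""
    def wrap(match: re.Match) -> str:
        # Escape <, >, &, " and ' so the value is a valid attribute value.
        value = html.escape(match.group(1), quote=True)
        return '<custom_content value="{}"/>'.format(value)

    return BRACKET_PAIR.sub(wrap, text)


def post_process(text: str) -> str:
    """Turn each <custom_content/> element back into the original bracket text."""
    return PLACEHOLDER.sub(lambda m: html.unescape(m.group(1)), text)


if __name__ == "__main__":
    sample = "<content><![CDATA[<p>This is [B]an[/B] example.</p>]]></content>"
    prepared = pre_process(sample)
    print(prepared)
    print(post_process(prepared) == sample)
```

Run on the sample line, `pre_process` yields the `<custom_content value="[B]an[/B]"/>` form shown above, and `post_process` restores the original text. Whether the file type then treats the new element as a placeholder still depends on how the embedded content processor is configured.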

    [edited by: Trados AI at 4:37 AM (GMT 0) on 5 Mar 2024]