Cannot get content between square brackets in XML file to be processed as embedded content

Hello folks,

I have an XML file that contains content between square brackets. I don't want this content to appear in the editor (or I want it at least to be converted to tags).

Here's what the XML looks like:

<?xml version="1.0" encoding="UTF-8"?>
<nodes>
<node name="XXXX1">
    <entry key="description">
[table id=34096]
    [row id=34345054]
        [cell id=3458659][b]lorem ipsum[/b][/cell]
        [cell id=4564567][b]lorem ipsum 2[/b][/cell]
        [cell id=3458936][b]Lorem ipsum 3[/b][/cell]
    [/row]
[/table]
</entry>
  </node>
  </nodes>

This is my first time trying this out so I tried to follow Paul's procedure from this topic but to no avail so far. Here's what I did:

- Create a new embedded file type processor by copying the Plain Text embedded content file, and adding the opening and ending regexp rules for the brackets:

Trados Studio screenshot showing the embedded content file type settings with a focus on SQUARE BRACKETS CONTENT and regular expression rules for brackets.

- Create a new XML file type and select this new content processor:

Trados Studio preview window displaying XML content with square brackets, indicating that the regular expression rules may not be applied correctly.

The preview of the XML file shows that I've not quite done things right. Any idea what I got wrong?

Thanks a lot,

Romain



[edited by: RWS Community AI at 10:49 AM (GMT 0) on 14 Nov 2024]
  • Hi,
    Try this regex: [?<=\[][^\]]+[?=\]] in Project Settings.
    Select 'Placeholder' as a tag type.
    Trados Studio Project Settings window showing Embedded Content section with regex entered in Tag definition rules and 'Placeholder' selected as Tag Type.
    Keep in mind that you have to re-import your source file(s) to make sure that the changes have been applied. This means that you have to remove the sdlxliff files from both the source and target panes in Trados. Then drag and drop your file(s) into the source pane again and run the 'Prepare without project TM' batch task. This is how you rebuild both sdlxliff files with the new embedded content rule.
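
To see what the suggested pattern picks up, here is a minimal sketch that runs it over one line of the sample. Trados Studio applies its own regex engine to the file; Python's `re` module is used here only for illustration, on the assumption that this particular pattern behaves the same way in both. The `SIMPLE` pattern is not from the thread and is included only for comparison.

```python
import re

# Pattern exactly as posted above. Read literally, it is three character classes:
#   [?<=\[]  one character out of  ? < = [
#   [^\]]+   one or more characters that are not ]
#   [?=\]]   one character out of  ? = ]
POSTED = re.compile(r"[?<=\[][^\]]+[?=\]]")

# A plainer pattern for "anything between square brackets".
# Assumption: added for comparison only, it does not appear in the thread.
SIMPLE = re.compile(r"\[[^\]]+\]")

line = "[cell id=3458659][b]lorem ipsum[/b][/cell]"

tags = POSTED.findall(line)   # the pieces that would become placeholder tags
text = POSTED.sub("", line)   # what is left for translation

print(tags)
print(text)
```

On this sample line both patterns find the same four bracketed pieces, and only "lorem ipsum" is left as text.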

  • Hi and thank you for your answer.

    I am not sure which file type I should add this rule to. Is it the "XML: AnyXML" file type?

  • Hi,

    From your screenshots, I think you are exploring features of the XML settings that won't give you what you need. For example, your sample has nothing to do with CDATA.

    The regex provided above was perfect. Now it's just a case of implementing it correctly.

    Please find attached my sample file and file type settings (which you can import and use), which should give you what you are looking for.

    The key area of note is that I added context to //entry
    Trados Studio parser settings showing rules for XML tags with context set to 'Cell'.

    This context is where I added embedded content, defined the regular expression given above, and reviewed your segmentation rules.

    Trados Studio embedded content settings with a regular expression rule highlighted and a segmentation hint set to 'Exclude'.

    At embedded content level you then have segmentation rules (Option A/B) that help decide if the content should be excluded or included with the tag

    Option A

    Preview of XML content in Trados Studio with tags like 'b' and 'cell' visible around text 'lorem ipsum'.

    Option B (optimal from my perspective) 
    Final preview in Trados Studio showing clean segmentation of text 'lorem ipsum' without XML tags.
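
    The two stages described in this reply (select the //entry context first, then treat the bracketed pieces inside it as tags) can be imitated outside Trados. The sketch below is only an illustration of that idea, not how the file type filter is implemented: it uses Python's standard XML parser, the regex exactly as posted earlier in the thread, a shortened copy of the sample file, and a hypothetical helper name `translatable_text`.

```python
import re
import xml.etree.ElementTree as ET

# Shortened copy of the sample from the original post (XML declaration omitted).
SAMPLE = """<nodes>
  <node name="XXXX1">
    <entry key="description">
[table id=34096]
    [row id=34345054]
        [cell id=3458659][b]lorem ipsum[/b][/cell]
        [cell id=4564567][b]lorem ipsum 2[/b][/cell]
    [/row]
[/table]
    </entry>
  </node>
</nodes>"""

# Regex as posted earlier in the thread.
BRACKET_TAG = re.compile(r"[?<=\[][^\]]+[?=\]]")


def translatable_text(xml_source):
    """Return the text that remains inside <entry> elements once the
    bracketed pieces are taken out. Hypothetical helper, for illustration."""
    root = ET.fromstring(xml_source)
    segments = []
    # Stage 1: pick the context, similar in spirit to the //entry rule.
    for entry in root.iterfind(".//entry"):
        raw = entry.text or ""
        # Stage 2: cut the text at every bracketed piece and keep
        # whatever non-blank text lies between them.
        for piece in BRACKET_TAG.split(raw):
            if piece.strip():
                segments.append(piece.strip())
    return segments


segments = translatable_text(SAMPLE)
print(segments)
```

    With the bracketed pieces removed, only "lorem ipsum" and "lorem ipsum 2" remain, which corresponds to the clean result described for Option B.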

    Community Sample.sdlftsettings

    Have a good day

    Lyds

    Lydia Simplicio | RWS Group

    _______
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Hi,

    Thank you for your detailed answer, finally being able to make this work is such a relief!

  • So happy you are having a period of relief and are able to continue!

    Lydia Simplicio | RWS Group

  • Hello Lydia,

    I have a similar problem.

    I created a new XML file type from "XML - Any XML" to catch only specific XML strings and not others. This is working.

    Trados Studio Project Settings showing XML file type rules with 'introtext' highlighted as 'Not translatable'.

    I created a new Embedded Content Processor to tag the soup of HTML tags inside these strings, in the middle of the text. This is not working.

    Trados Studio Editor displaying XML code with embedded HTML content, highlighting the 'Ekon Curso Embedded Content' processor.

    But the regex does work on the text when I test it in the Trados Editor, so I believe the problem has nothing to do with the regex.

    Screenshot of Trados Studio with an XML file open, showing HTML tags within the text. A 'Find and Replace' dialog box is open with a regex query entered.

    I am telling the new settings to use this Embedded Content Processor this way:

    Trados Studio Project Settings with 'Embedded content' selected, showing 'Ekon Curso Embedded Content' processor in use.

    The result still contains no tags, and I desperately need to have this soup of tags tagged.

    What am I doing wrong? Can you give me a clue?

    Thanks

    The files used are in the link below:

    /cfs-file/__key/communityserver-discussions-components-files/90/community.zip

  •  

    You have defined the extra content in the HTML parser, but you seem not to be using it at all. You must go to the embedded content processing of your XML file type, then select HTML there. And before you do so, you need to add some structure information to your parser rules in XML to make HTML parsing possible there.
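
    The idea of handing the content of an XML element to an HTML parser, instead of matching tags with a regex, can be sketched outside Trados as follows. This is only an illustration with Python's standard library. The sample string, the assumption that the HTML is stored escaped inside the element, and the names `TagSplitter` and `split_html` are all hypothetical; only the element name `introtext` comes from the screenshots in this thread.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample: HTML markup stored as escaped text inside an XML element.
SAMPLE = (
    "<root><introtext>"
    "&lt;p&gt;Welcome to the &lt;strong&gt;course&lt;/strong&gt;&lt;/p&gt;"
    "</introtext></root>"
)


class TagSplitter(HTMLParser):
    """Collects HTML tags and text runs separately (illustrative helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(f"<{tag}>")

    def handle_endtag(self, tag):
        self.tags.append(f"</{tag}>")

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


def split_html(fragment):
    """Return (tags, text runs) found in an HTML fragment."""
    parser = TagSplitter()
    parser.feed(fragment)
    parser.close()
    return parser.tags, parser.text


# Stage 1: the XML parser selects the element and unescapes its content.
element = ET.fromstring(SAMPLE).find("introtext")
# Stage 2: the HTML parser separates markup from text.
tags, text = split_html(element.text)
print(tags)
print(text)
```

    The point of the sketch is the order of the two stages: the HTML parser only ever sees content that the XML stage has selected and passed on.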

    _________________________________________________________

    When asking for help here, please be as accurate as possible. Please always remember to give the exact version of product used and all possible error messages received. The better you describe your problem, the better help you will get.

    Want to learn more about Trados Studio? Visit the Community Hub. Have a good idea to make Trados Studio better? Publish it here.

  • Hi Jerzy,

    Now I'm confused :-) Below is my Embedded Content structure with a regex. I copied the Plain Text one and changed it.

    Trados Studio Project Settings window showing Embedded Content Processors for Microsoft Excel with empty regex fields for opening and closing patterns.

    Trados Studio Project Settings window displaying Inline tags for HTML 5 with a placeholder tag pattern defined as not translatable.

    Then I use this one in the file type for XML, which I also copied from "XML - Any XML" and changed:

    Trados Studio File Types settings for XML Global Ekon cursos showing file type name, icon, identifier, and empty description field.

    Trados Studio Detection settings for XML Global Ekon cursos with criteria for XML documents recognition listed but no specific settings highlighted.

    Trados Studio Parser rules for XML Global Ekon cursos with various XML elements and attributes, highlighting 'introbacktext' as always translatable.

    At what stage am I failing?

  • And I'm calling this embedded processor in this file type:

    Trados Studio Project Settings window showing 'Embedded content' selected with options to process embedded content using the processor 'Ekon Cursos Embedded Content'.

    By the way, how can I create that structure you mention?

  •  

    In the parser rules, choose the corresponding rule, select edit and go to structure, edit, then select, for example, "Paragraph":

    Trados Studio parser rules interface showing the selection of the 'Paragraph' element rule with an arrow pointing to the 'Edit' button for structure information.

    In the embedded content part you MUST add this structure info, as otherwise no embedded content processing will take place.
