Prevent line-breaks in regex-based file type

I'm trying to create a regex-based HTML parser for a template language that contains HTML fragments.
(I can't use the built-in HTML file types for this.)

The settings are:

[Screenshot: Trados Studio regex-based parser settings, with the multiline option checked for the opening and closing patterns.]
[Screenshot: Trados Studio rules configuration, with the HTML tags defined as tag pairs and DOCTYPE as a placeholder.]
[Screenshot: Trados Studio line breaks configuration, with the 'Remove Line Breaks' option selected.]

When I tested this with the following HTML file:

<!DOCTYPE html>

<html>
    <head>
        <title>test</title>
    </head>

    <body>
        <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis interdum est ut bibendum
        rutrum. Vestibulum luctus, nibh ac viverra molestie, tortor neque fringilla justo, ut venenatis
        arcu sapien quis lectus.
        </p>
    </body>
</html>

I noticed that the line-breaks in the <p> tag were maintained even though Remove Line Breaks was selected.

[Screenshot: segmented text in Trados Studio, with the line breaks maintained in the paragraph content despite the 'Remove Line Breaks' setting.]

What setting do I need to change so that Studio splits the <p> contents into only three segments?
(I'm trying to duplicate the segmentation algorithm of the built-in HTML file types.)
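
To make the expected result concrete, here is a rough Python sketch of the outcome I am after. It runs outside Studio and is not Studio's actual segmentation algorithm: the function name `expected_segments`, the `<p>` regex, and the naive sentence splitter are all assumptions made for illustration only.

```python
import re

# The <p> element from the test file above, with its original line breaks.
SAMPLE = """<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis interdum est ut bibendum
        rutrum. Vestibulum luctus, nibh ac viverra molestie, tortor neque fringilla justo, ut venenatis
        arcu sapien quis lectus.
        </p>"""


def expected_segments(html: str) -> list[str]:
    """Hypothetical helper: return the segments I would expect for one <p> element."""
    # Grab the paragraph body; DOTALL lets '.' match across line breaks.
    match = re.search(r"<p>(.*?)</p>", html, flags=re.DOTALL)
    if match is None:
        return []
    # Collapse every run of whitespace (line breaks and indentation) into one space.
    text = re.sub(r"\s+", " ", match.group(1)).strip()
    # Naive sentence split (an assumption): break at a space that follows '.', '!' or '?'.
    return re.split(r"(?<=[.!?]) ", text)


if __name__ == "__main__":
    for number, segment in enumerate(expected_segments(SAMPLE), start=1):
        print(number, segment)
```

With line breaks removed first, the paragraph yields one segment per sentence, which is the behaviour I would like the regex-based file type to reproduce.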


