How to add a rule to segment code listings in HTML pre tags?

I believe this is the first time I have run into this problem in many years:

I need to import and export the content of HTML <pre> tags with an attribute data-type="programlisting" so that each line will be parsed as one segment.

Source looks like:

<pre data-code-language="java" data-type="programlisting">
public class Song {
String name;
String authorName;
String albumName;
}</pre>

So far, during import, everything within the pre tag is parsed as one segment, so everything ends up on one line when the translation is exported (please ignore that the target and source content are identical here; this example does not contain any comments):

<pre data-code-language="java" data-type="programlisting">public class Song {String name; String authorName; String albumName; }</pre>
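To make the desired behaviour concrete, here is a minimal Python sketch of what "one segment per line" would mean for the sample above. It only illustrates the expected result and is not how Trados Studio parses files; the class name `ProgramListingSegmenter` and the helper `segment_listings` are made up for this example, and blank lines are assumed not to count as segments.

```python
from html.parser import HTMLParser


class ProgramListingSegmenter(HTMLParser):
    """Collects the text of <pre data-type="programlisting"> blocks,
    split into one segment per non-empty line (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._inside = False   # True while inside a matching <pre>
        self._buffer = []      # text chunks of the current block
        self.listings = []     # one list of line segments per block

    def handle_starttag(self, tag, attrs):
        # Only <pre> elements carrying data-type="programlisting" qualify.
        if tag == "pre" and dict(attrs).get("data-type") == "programlisting":
            self._inside = True
            self._buffer = []

    def handle_data(self, data):
        if self._inside:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "pre" and self._inside:
            self._inside = False
            text = "".join(self._buffer)
            # Assumption: each non-empty line is one segment.
            self.listings.append(
                [line for line in text.splitlines() if line.strip()]
            )


def segment_listings(html):
    """Return a list of line-segment lists, one per matching <pre> block."""
    parser = ProgramListingSegmenter()
    parser.feed(html)
    parser.close()
    return parser.listings


SAMPLE = """<pre data-code-language="java" data-type="programlisting">
public class Song {
String name;
String authorName;
String albumName;
}</pre>"""

if __name__ == "__main__":
    for segment in segment_listings(SAMPLE)[0]:
        print(segment)
```

For the sample, this yields five segments, one per line of the listing, which is what I would like to see in the editor. A plain `<pre>` without the attribute is ignored.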

I tried to add a condition to a copy of the pre tag rule (which I prioritized and put before the existing pre tag rule), but it looks like there is no option to define how the segmentation is done.

Am I overlooking something? Or is there any other way to deal with that problem? (My Xmas wish! :-) )

Thanks for any insights!

  • Hello,

    To segment code listings in HTML pre tags, you can use the "Embedded Content Processor" in Trados Studio. Here's how you can do it:

    Step 1: Open your project in Trados Studio.

    Step 2: Go to "Project Settings" and select "File Types".

    Step 3: Find and select "HTML" from the list of file types.

    Step 4: In the HTML settings, go to the "Embedded Content Processors" section.

    Step 5: Add a new rule for the "pre" tag with the attribute data-type="programlisting".

    Step 6: In the rule settings, set the "Segmentation" option to "Break".

    Step 7: Save your changes and close the settings.

    Now, when you import an HTML file with pre tags that have the attribute data-type="programlisting", each line within the pre tag should be parsed as a separate segment.

    Please note that this method will only work if the content within the pre tags is formatted with line breaks. If the content is all on one line, it will still be imported as a single segment.

    I hope this helps! If you have any other questions, feel free to ask.

    Best regards,

    RWS Community AI

  • Thanks AI! :-)

    Re steps 4, 5, 6:

    I see an option "Other elements identified by" and then "Parser rules". Is this the correct one? When I add a rule for pre, I cannot specify an attribute and I don't see an option for Segmentation.

  • Hello,

    I apologize for the confusion. The steps I provided earlier are a general guide and may not exactly match what you see in your version of Trados Studio.

    To specify an attribute for the "pre" tag and set the segmentation option, you should follow these steps:

    Step 1: Open your project in Trados Studio.

    Step 2: Go to "Project Settings" and select "File Types".

    Step 3: Find and select "HTML" from the list of file types.

    Step 4: In the HTML settings, go to the "Parser" section.

    Step 5: Find the "pre" tag in the list of elements and click "Edit Rule".

    Step 6: In the "Edit Rule" window, you can specify the attribute (data-type="programlisting") and set the "Segmentation hint" to "Include with text".

    Step 7: Save your changes and close the settings.

    Now, when you import an HTML file with pre tags that have the attribute data-type="programlisting", each line within the pre tag should be parsed as a separate segment.

    I hope this helps! If you have any other questions, feel free to ask.

    Best regards,

    RWS Community AI

  • The option "Segmentation hint" is disabled. I guess I have to use "Tag Type" = "Inline"?

  • Hello,

    Yes, you are correct. If the "Segmentation hint" option is disabled, you can change the "Tag Type" to "Inline". This should allow the content within the "pre" tag to be segmented line by line.

    Here's how you can do it:

    Step 1: Open your project in Trados Studio.

    Step 2: Go to "Project Settings" and select "File Types".

    Step 3: Find and select "HTML" from the list of file types.

    Step 4: In the HTML settings, go to the "Parser" section.

    Step 5: Find the "pre" tag in the list of elements and click "Edit Rule".

    Step 6: In the "Edit Rule" window, change the "Tag Type" to "Inline".

    Step 7: Save your changes and close the settings.

    Now, when you import an HTML file with pre tags that have the attribute data-type="programlisting", each line within the pre tag should be parsed as a separate segment.

    I hope this helps! If you have any other questions, feel free to ask.

    Best regards,

    TradosAI

  • FINAL SOLUTION:


    Step 1: Go to "Project Settings" and select "File Types".

    Step 2: Find and select "HTML" from the list of file types.

    Step 3: In the HTML settings, go to the "Parser" section.

    Step 4a: Either find the "pre" tag in the list of elements and click "Edit Rule",
    Step 4b: or find the "pre" tag in the list of elements, copy it to a new rule, and edit the copy.

    Step 5: Edit the condition to read 'pre data-type="programlisting"' (without the outer single quotes).

    Step 6: Change "Tag Type" to "Inline".

    Step 7: Set the "Segmentation hint" to "Include with text".

    Step 8: In the HTML settings, go to the "Embedded Content" section.

    Step 9: Check "Other elements identified by" and "Parser rules", and add "pre" with "Embedded Content Plain Text v 1.0.0.0"

    Step 10: Load test file and check preview parsing.

    Step 11: Remove already parsed file(s) from project (source and targets) and add again.
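
For anyone who wants to double-check the exported file after step 11, here is a rough Python sketch that compares the number of non-empty lines in each programlisting block before and after export. It is an external sanity check, not a Trados feature; the function names `listing_line_counts` and `collapsed_listings` are made up for this example, and the regular expression assumes simple, non-nested pre blocks with the attribute written exactly as data-type="programlisting".

```python
import re

# Matches <pre ... data-type="programlisting" ...>body</pre> (non-nested).
PRE_BLOCK = re.compile(
    r'<pre\b[^>]*\bdata-type="programlisting"[^>]*>(.*?)</pre>',
    re.DOTALL | re.IGNORECASE,
)


def listing_line_counts(html):
    """Return the number of non-empty lines in each programlisting block."""
    return [
        sum(1 for line in body.splitlines() if line.strip())
        for body in PRE_BLOCK.findall(html)
    ]


def collapsed_listings(source_html, target_html):
    """Return the indexes of listings whose line count changed on export."""
    src = listing_line_counts(source_html)
    tgt = listing_line_counts(target_html)
    if len(src) != len(tgt):
        raise ValueError("source and target contain a different number of listings")
    return [i for i, (s, t) in enumerate(zip(src, tgt)) if s != t]


SOURCE = """<pre data-code-language="java" data-type="programlisting">
public class Song {
String name;
String authorName;
String albumName;
}</pre>"""

# The collapsed export from the original question.
BAD_TARGET = (
    '<pre data-code-language="java" data-type="programlisting">'
    "public class Song {String name; String authorName; String albumName; }</pre>"
)

if __name__ == "__main__":
    print(collapsed_listings(SOURCE, BAD_TARGET))
    print(collapsed_listings(SOURCE, SOURCE))
```

An empty list means every listing kept its line count; any index in the result points to a listing that was merged or split during the round trip.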
