Segmentation - Ignore text in square brackets

Hi there,

I have an Excel file that looks like this:

[Do not translate]Translate, translate, translate.[Do not translate]Translate.[Do not translate]Translate, translate, translate!

By default, Studio treats that as one segment. However, I want Studio to segment it like this:


[Do not translate]

Translate, translate, translate.

[Do not translate]

Translate.

[Do not translate]

Translate, translate, translate!


I'd like to keep the text in square brackets visible in the file, because it contains context information. I can always lock these segments so they don't hinder my translation flow.

I can think of two ways of getting Studio to segment the file like this - either adjusting the segmentation rules of the TM, or adjusting the file-type definition.


Any pointers on which option is preferable and *how* to actually do it?

  • These are the XML parser settings; here you configure "where the embedded content is stored inside the XML".

    But, as I mentioned, you would probably have been better off using the "XML (Legacy Embedded Content)" file type in this case... There you can define the tags directly in its settings.

    If you really want to experiment with the "new" XML parser, the process is a bit more complicated...
    First, create the plain text embedded content 'file type': go to Embedded Content Processors, make a copy of the Plain Text type, give it a recognizable name and define the tags there.
    Then go back to your XML parser settings, turn on embedded content processing, and select the plain text embedded content processor you just created from the dropdown. Finally, select "Element" from the dropdown shown in your screenshot. This tells the parser to treat the content of each parsed XML element as embedded content and to send it to the embedded content processor for processing.
    And you should be done...
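
Before typing a tag definition into the embedded content processor, it can help to check the pattern against the sample text outside Studio. The snippet below is only a sketch under assumptions: the pattern `\[[^\[\]]*\]` is my own guess at a suitable tag definition (it is not taken from the thread), the function name `split_on_brackets` is hypothetical, and Python's `re` module stands in for the regex engine Studio uses, which should behave the same for a pattern this simple.

```python
import re

# Assumed pattern (not from the thread): "[", then any characters that
# are not square brackets, then "]".
BRACKET_PATTERN = r"\[[^\[\]]*\]"


def split_on_brackets(text):
    """Split text into bracketed and non-bracketed chunks, keeping both."""
    # Wrapping the pattern in a capture group makes re.split keep the
    # bracketed matches in the result instead of discarding them.
    parts = re.split("(" + BRACKET_PATTERN + ")", text)
    # Drop the empty strings re.split produces at the edges.
    return [part for part in parts if part.strip()]


sample = (
    "[Do not translate]Translate, translate, translate."
    "[Do not translate]Translate."
    "[Do not translate]Translate, translate, translate!"
)

chunks = split_on_brackets(sample)
for chunk in chunks:
    kind = "tag " if re.fullmatch(BRACKET_PATTERN, chunk) else "text"
    print(kind, chunk)
```

If the chunks come out as in the question (bracketed context, then translatable text, alternating), the same pattern is a reasonable starting point for the tag definition. How Studio then presents the matches (as tags or as separate segments) depends on the tag settings you choose, which this sketch does not model.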