Segmentation - Ignore text in square brackets

Hi there,

I have an Excel file that looks like this:

[Do not translate]Translate, translate, translate.[Do not translate]Translate.[Do not translate]Translate, translate, translate!

By default, Studio treats that as one segment. However, I want Studio to segment it like this:


[Do not translate]

Translate, translate, translate.

[Do not translate]

Translate.

[Do not translate]

Translate, translate, translate!


I'd like to keep the text in square brackets visible in the file, because it contains context information. I can always lock these segments so they don't hinder my translation flow.

I can think of two ways of getting Studio to segment the file like this - either adjusting the segmentation rules of the TM, or adjusting the file-type definition.


Any pointers on which option is preferable and *how* to actually do it?

  • Then you simply use the XML parser and set up its embedded content parser like this:

    Start tag: left square bracket, followed by one or more "not right bracket" characters, followed by right square bracket

    \[[^\]]+]

    End tag: left square bracket, followed by slash, followed by one or more "not right bracket" characters, followed by right square bracket

    \[\/[^\]]+]

    Leaving the default segmentation hint "May Exclude" should be okay, as long as you DON'T need to see the tags in the editor.

    Using the XML parser with the legacy embedded content parser would probably be better in this case, as it would result in "cleaner" content in the editor, without a gazillion "internal file separators".
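
For anyone who wants to sanity-check the two patterns before entering them in the file type settings, here is a minimal sketch in Python. It is not how Studio processes the file: Studio uses the .NET regex engine, and this assumes Python's `re` module treats these particular patterns the same way. The sample string is the one from the question, while the helper name `split_on_brackets` and the `[/note]` test string are made up for illustration.

```python
import re

# Patterns copied verbatim from the reply above.
# Assumption: Python's re handles them the same way as the .NET engine Studio uses.
START_TAG = re.compile(r"\[[^\]]+]")
END_TAG = re.compile(r"\[\/[^\]]+]")

# Sample cell content from the original question.
SAMPLE = (
    "[Do not translate]Translate, translate, translate."
    "[Do not translate]Translate."
    "[Do not translate]Translate, translate, translate!"
)


def split_on_brackets(text):
    """Hypothetical helper: split text into bracketed chunks and the runs between them."""
    # The capturing group makes re.split keep the bracketed chunks in the result.
    parts = re.split(r"(\[[^\]]+])", text)
    # Drop the empty string that appears when the text starts with a bracket.
    return [part for part in parts if part]


pieces = split_on_brackets(SAMPLE)
for piece in pieces:
    kind = "tag " if START_TAG.fullmatch(piece) else "text"
    print(kind, piece)
```

Run against the sample, this yields the six chunks the question asks for, alternating bracketed context and translatable text. One thing the sketch also shows: in Python the start-tag pattern matches a string such as `[/note]` too, because a slash is just another "not right bracket" character. Whether that overlap matters in Studio depends on how the embedded content parser applies its rules, which this sketch does not model.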
