So what does this tool do?
- You can lock segments based on structure or content
- You can remove unwanted tags in the source
- You can modify the source or target text as you like and create “settings” files for easy reuse
- You can create tags for embedded XML or HTML content
- You can create placeholders for fixed words or phrases
Some of the above is already possible with other tools, but the best part is that this is a Batch Task, so you can run it directly in Trados. If any of the above sounds interesting, please read on.
New Batch Task Menu Items:
The tool adds 2 new items to your batch task menu:
Cleanup Source
When you click on Cleanup Source and then hit “Next”, you will be greeted with the following screen:
Locking segments
You can lock segments based on search expressions using the left-hand box (the Content Locker). In order to lock based on the document structure, use the right-hand box (the Structure Locker).
Content Locker Example
I mainly translate from Japanese to English, and I often get segments that contain no Japanese characters. It can sometimes be useful to lock these. The following regular expression checks for such segments: ^[^亜-熙ぁ-んァ-ヶ]+$
Make sure you turn on Regex for the above to work.
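To see what the expression does outside Trados, here is a minimal Python sketch. It assumes that Python's `re` module treats these character ranges the same way the plug-in's regex engine does, which I have not verified; `should_lock` is a hypothetical helper name, not part of the plug-in.

```python
import re

# The expression from the post: one or more characters, none of which
# falls in the listed kanji, hiragana, or katakana ranges.
NO_JAPANESE = re.compile(r"^[^亜-熙ぁ-んァ-ヶ]+$")

def should_lock(segment: str) -> bool:
    """Return True when no character in the segment is in the ranges."""
    return NO_JAPANESE.match(segment) is not None

# Print the decision for a few sample segments.
for text in ["Trados 2021", "ABC-123", "これはテストです", "東京 Tokyo"]:
    print(text, "->", "lock" if should_lock(text) else "leave")
```

One thing to watch: in Python the ranges are compared by code point, so 亜-熙 spans U+4E9C to U+7199 only. A kanji outside that span, such as 語 (U+8A9E), is not recognised as Japanese by this pattern, so it is worth testing the expression against your own files before relying on it.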
The headers in the above screenshot are abbreviated for space reasons, so they might be a little difficult to understand:
- Regex: Regular expression matching
- Case: Case-sensitive searching
- Whole: Whole word matching
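As a rough illustration of how these three options usually combine in search tools, here is a short Python sketch. How the plug-in actually interprets them is an assumption on my part; `build_pattern` and its parameters are hypothetical names chosen to mirror the column headers.

```python
import re

def build_pattern(search: str, regex: bool = False, case: bool = False,
                  whole: bool = False) -> re.Pattern:
    """Turn a search string plus the three options into a compiled pattern."""
    # Without Regex, the search text is taken literally.
    body = search if regex else re.escape(search)
    # Whole: only match when the text is bounded by word boundaries.
    if whole:
        body = rf"\b(?:{body})\b"
    # Case: when off, matching ignores upper/lower case.
    flags = 0 if case else re.IGNORECASE
    return re.compile(body, flags)

# "cat" is found inside "Concatenate" unless Whole is switched on.
print(bool(build_pattern("cat").search("Concatenate")))
print(bool(build_pattern("cat", whole=True).search("Concatenate")))
```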
Structure Locker Example
This should be straightforward: the structure info is read from the sdlxliff files of the project. The example file I used happens to be an Excel file, which is why you see items like sdl:worksheet and sdl:textbox. In the following screenshot, I selected sdl:textbox to lock any text that appears in text boxes.
Removing tags
The plug-in divides tags into two categories, Formatting Tags and Placeholder Tags:
- Formatting Tags: These always start with <cf>. <cf> tags can contain a range of information such as font name, font size, italic, bold, etc. In Example 1 below, each tag contains the font name and size only, while Example 2 contains an italic="True" attribute.
Example 1 (Font Name and Size):
Example 2 (italic="True"):
In order to remove the tags in Example 1, you need to select Font Name and Font Size (see screenshot below), since the tag specifies both of these:
However, the tag in Example 2 will not be removed as it contains italic="True". To remove this tag, you also need to select Italic:
- Placeholder Tags: In short, these are the <ph> (Placeholder) tags in the sdlxliff file. Sometimes they contain inline formatting which may not be needed.
I would exercise caution when removing these tags, though, as they are often necessary!
In the following screenshot, the <br> tags are used for aligning text in text boxes in the original Excel file. They are probably required, but there might be times when you want to remove this type of formatting.
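The selection rule for Formatting Tags described above (a <cf> tag is only removed once every kind of formatting it carries has been selected) can be sketched in a few lines of Python. This is a model of the behaviour as I understand it, not the plug-in's code; `is_removable` and the attribute names are hypothetical, and Example 2 is reduced to its italic attribute for simplicity.

```python
def is_removable(tag_attributes: set, selected: set) -> bool:
    """A tag is removable when all of its attributes are among the selected ones."""
    return bool(tag_attributes) and tag_attributes <= selected

# Hypothetical attribute sets standing in for the two examples.
example_1 = {"font_name", "font_size"}
example_2 = {"italic"}

selection = {"font_name", "font_size"}
print(is_removable(example_1, selection))               # Example 1 is covered
print(is_removable(example_2, selection))               # italic not selected yet
print(is_removable(example_2, selection | {"italic"}))  # now covered
```

Modelling the check as a subset test makes it clear why selecting Font Name alone would not remove the tags in Example 1.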