Overview
SDLTM Anonymizer is an SDL Trados Studio plugin with features to anonymize personal data within segments, system fields and custom fields in Translation Memories.
Data anonymization is done for the purpose of protecting private or personally identifiable data by substituting it in the Translation Units with placeholders, while still maintaining the integrity of the data itself.
The content to anonymize is identified by regular expressions that match patterns such as email addresses, phone numbers, IP addresses and social security numbers. An integrated preview displays a filtered list of the Translation Units whose content matched the search criteria, so that you can select and confirm the content to be anonymized before launching the process.
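As an illustration of the idea only, the following Python sketch replaces regular expression matches with placeholders. The two patterns, the placeholder format and the function name anonymize are assumptions made for this example; they are not the plugin's built-in rules or its actual tag format.

```python
import re

# Hypothetical example rules; the plugin ships its own default list.
RULES = [
    ("email", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("ipv4", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
]


def anonymize(text: str) -> str:
    """Replace every match of every rule with a placeholder named after the rule."""
    for name, pattern in RULES:
        text = pattern.sub(f"<{name}/>", text)
    return text


sample = "Contact jane.doe@example.com from host 192.168.0.1."
print(anonymize(sample))  # Contact <email/> from host <ipv4/>.
```

The surrounding text is left untouched; only the matched spans are substituted.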
Support for working with both server and file-based Translation Memories is available.
Installation Instructions
The application is installed by double-clicking the sdlplugin file available through the SDL AppStore.
Terms & Conditions
When the application loads for the first time, a window will appear asking you to accept the Terms & Conditions.
After you agree by selecting the checkbox option I agree, you'll then be able to see and work with the features of SDLTM Anonymizer. If you hit the Close button by mistake before selecting the option I agree, the window will reappear the next time you launch Studio.
Settings
A backup of the translation memory is performed prior to applying any changes. This option is checked by default, but the user can turn this feature off if not required.
- File-based: the local TM (*.sdltm) is copied in full to the backup path.
- Server-based: a full TM export (*.tmx) is performed and saved in the backup path.
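The file-based case amounts to a plain file copy. A minimal Python sketch is shown here, assuming a timestamped file name so that earlier backups are not overwritten; the naming scheme and the function name backup_file_based_tm are hypothetical and do not describe how the plugin names its backups.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_file_based_tm(tm_path: Path, backup_dir: Path) -> Path:
    """Copy a local *.sdltm file into backup_dir and return the path of the copy."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: a timestamp suffix keeps successive backups apart.
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    target = backup_dir / f"{tm_path.stem}_{stamp}{tm_path.suffix}"
    shutil.copy2(tm_path, target)  # copy2 also preserves file metadata
    return target
```

The original file is never modified by the copy, so it can be restored by copying the backup over it.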
A report is created to record all changes applied to the TM and saved in the path indicated by the user.
Navigation / Translation Memories
SDLTM Anonymizer supports working with both server and file-based Translation Memories. To anonymize content in a TM, it will first need to be added to the list (see Add Translation Memory), and then loaded into memory, by selecting the Load checkbox.
Add Translation Memory
File-based (*.sdltm)
- Select the Add file-based TM button from the Translation Memories ribbon
Browse and select the Translation Memories from the file system to be added to the list.
- Select the Select Folder button from the Translation Memories ribbon
Browse and select the folder on the file system that contains SDL Translation Memories
Note: all TM files with the extension (*.sdltm) are added to the list.
- Drag and Drop the Translation Memories (*.sdltm) in the navigation area.
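The Select Folder behaviour described above can be sketched in Python as a simple extension filter. This is an illustration only: whether the plugin also searches subfolders is not stated, so the sketch assumes a non-recursive scan, and the function name find_translation_memories is hypothetical.

```python
from pathlib import Path


def find_translation_memories(folder: Path) -> list[Path]:
    """Return the *.sdltm files directly inside folder, sorted by path."""
    # Assumption: only the selected folder itself is scanned, not its subfolders.
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".sdltm"
    )
```

Files with any other extension are skipped, which matches the note that only TM files with the extension (*.sdltm) are added.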
Example
Server-based
- Select the Add server TM button from the Translation Memories ribbon
- Provide the URI and credentials to log in to the server where the Translation Memories reside
- Next, select the Translation Memories that you want to add, from the list of memories available in the Select Server-based Translation Memory window, and click OK
Anonymizing data
Content Filtering Rules
The Content Filtering Rules tab manages the criteria used to match the content within the segments that you want to anonymize. A default list of rules with some common regular expressions (e.g. email address, IPv4 address, MAC address, social security numbers) is added the first time you launch the plugin.
Add a new rule
- Select the Click here to add a new rule button in the data grid, or use the hot key [Ctrl + N]
- Provide the criteria for the Rule and Description
- Select the Save button
Edit rule
The rule and description can be edited inline within the data grid. Simply select the cell and start typing.
Delete rule
Select the rule in the data grid and then click on the Delete button or hit the [Delete] key on your keyboard. To delete multiple rules, select them individually in the data grid, or select all of them with [Ctrl + A], and then click on the Delete button or hit the [Delete] key.
Order
To change the order of a rule, simply select the up/down arrow keys associated with each rule in the data grid. The order in which the rules are executed is important to ensure the right content is matched in the right order, given the criteria used.
Let’s assume we have an email address in the segment that contains some numbers and we have two rules, one to match email format and another to match number format. Now, if we executed the regular expression for matching the numbers first, then the rule for matching the email would no longer be successful, as the email format would have been changed (e.g. numbers would have been replaced with a tag) and no longer resemble a valid e-mail convention.
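The effect can be reproduced with a short Python sketch. The two patterns and the placeholder tags are assumptions chosen for the illustration and are not the plugin's own rules.

```python
import re

# Hypothetical rules for the illustration.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
NUMBER = re.compile(r"\d+")


def apply_rules(text, rules):
    """Apply each (pattern, placeholder) pair in the given order."""
    for pattern, placeholder in rules:
        text = pattern.sub(placeholder, text)
    return text


segment = "Write to agent007@example.com or call 5551234."

email_first = apply_rules(segment, [(EMAIL, "<email/>"), (NUMBER, "<number/>")])
number_first = apply_rules(segment, [(NUMBER, "<number/>"), (EMAIL, "<email/>")])

print(email_first)   # Write to <email/> or call <number/>.
print(number_first)  # Write to agent<number/>@example.com or call <number/>.
```

When the number rule runs first, the digits inside the address are replaced and the email rule no longer finds a match, so part of the address remains in the segment.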
Export rules
To export rules, select them from the data grid and click on the Export button from the Actions ribbon. The selected rules will then be saved in an Excel file with the following structure. You can use this file to distribute the rules to other teams, or keep it as a backup to import later for other projects.
Import rules
To import rules, select the Import button from the Actions ribbon. If the ID or Rule from the imported data matches that of an existing rule in the data grid, the existing rule will be updated; otherwise, a new rule will be created and added to the list.
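The update-or-add behaviour described above could be sketched in Python as follows. The Rule class, its field names and the choice to keep the existing ID on update are assumptions made for this example; they do not describe the plugin's data model.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    id: str
    pattern: str
    description: str


def import_rules(existing, imported):
    """Update rules that match by ID or pattern; append the others as new rules."""
    merged = list(existing)
    for new in imported:
        for i, old in enumerate(merged):
            if old.id == new.id or old.pattern == new.pattern:
                # Assumption: the existing ID is kept, the other fields are updated.
                merged[i] = Rule(old.id, new.pattern, new.description)
                break
        else:
            # No existing rule matched, so the imported rule is added to the list.
            merged.append(new)
    return merged
```

The input list is not modified; a new list is returned so that the result can be reviewed before it replaces the current rules.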
Preview Changes
The Preview changes window enables you to review the content that is matched by rules before applying any changes to the Translation Memory. To preview the content that is matched by the rules, select the Preview Changes button from the Actions ribbon.
Note: Ensure that you have at least one Translation Memory loaded from the navigation list and one or more rules selected from the Content Filtering Rules tab.
A new window will appear with the content that matched the criteria of the selected rules highlighted with an orange/brownish background. To choose the segments whose content should be anonymized, select the checkbox associated with each segment individually, or select all of them by checking the Apply checkbox in the column header.
As well as the rules automatically identifying the content to be anonymized, you can mark-up additional content for anonymization directly in the preview window, by selecting the appropriate content of the segment in the Source or Target and then choosing the option “Select content for anonymization” from the context menu, as follows:
Select the button Apply changes to start the process of anonymizing the data in the Translation Memory.
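The filtering performed by the preview can be pictured with the following Python sketch, which keeps only the segments that contain at least one match and records the matched spans that would be highlighted. The function name preview and the example pattern are hypothetical.

```python
import re


def preview(segments, patterns):
    """Return (index, text, spans) for each segment with at least one match."""
    results = []
    for index, text in enumerate(segments):
        # Collect the (start, end) offsets of every match of every rule.
        spans = sorted(m.span() for p in patterns for m in p.finditer(text))
        if spans:
            results.append((index, text, spans))
    return results


email = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
segments = ["Hello world", "Mail bob@example.com now"]
for index, text, spans in preview(segments, [email]):
    print(index, [text[start:end] for start, end in spans])
```

Segments without a match are left out of the result, which mirrors the filtered list shown in the preview window.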
Example
System Fields
Once you load the Translation Memory, a unique list of the User Names that are found in the TM will be populated in the System Fields data grid. These fields are recovered from the Created By, Modified By and Last Used By properties that are associated with each of the Translation Units in the TM.
To update the user name value of the System Field, simply provide a new value and select the Change checkbox for that field, as shown below.
Select the button Apply changes from the Actions ribbon to start the process of updating the System Fields in the Translation Memory.
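The two steps above, collecting the unique user names and applying the new values, can be sketched in Python. The TranslationUnit class and its attribute names are hypothetical stand-ins for the Created By, Modified By and Last Used By properties; they are not the Studio API.

```python
from dataclasses import dataclass


@dataclass
class TranslationUnit:
    created_by: str
    modified_by: str
    last_used_by: str


USER_FIELDS = ("created_by", "modified_by", "last_used_by")


def unique_user_names(units):
    """Collect the distinct, non-empty user names found in the three properties."""
    names = {getattr(tu, field) for tu in units for field in USER_FIELDS}
    names.discard("")
    return sorted(names)


def apply_user_changes(units, changes):
    """Replace user names in place; changes maps an old name to its new value."""
    for tu in units:
        for field in USER_FIELDS:
            current = getattr(tu, field)
            setattr(tu, field, changes.get(current, current))
```

Only the names present in the changes mapping are replaced, which corresponds to the rows where the Change checkbox is selected.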
Export System Fields
To export system fields, select them from the data grid and click on the Export button from the Actions ribbon. The selected system fields will then be saved in an Excel file with the following structure. You can use this file as a backup, or import it later for other projects.
Import System Fields
To import system fields, select the Import button from the Actions ribbon. If the User Name from the imported data matches that of an existing System Field in the data grid, the New Value will be imported; otherwise, it will be ignored.
Custom Fields
Once you load the Translation Memory, all of the custom fields found in that TM will be populated in the Custom Fields data grid. A custom field can be of type SinglePickList, SingleString, MultiplePickList, MultipleString, Integer or DateTime and, depending on the type, may contain more than one value.
To update a custom field value, select the field from the left pane and then provide a New Value for the field value on the right, remembering to also check the Change checkbox for those field values; see the following screenshot for an example.
Select the button Apply changes from the Actions ribbon to start the process of updating the Custom Fields in the Translation Memory.
Export Custom Fields
To export custom fields, select them from the data grid and click on the Export button from the Actions ribbon. The selected custom fields will then be saved in an Excel file with the following structure. You can use this file as a backup, or import it later for other projects.
Import Custom Fields
To import custom fields, select the Import button from the Actions ribbon. If the Name, Type and Value from the imported data match those of an existing Custom Field in the data grid, the New Value will be imported; otherwise, it will be ignored.
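The matching on Name, Type and Value can be sketched in Python as a lookup on a three-part key. The CustomFieldValue class and its field names are assumptions made for this example and do not reflect the plugin's internal types or the layout of the Excel file.

```python
from dataclasses import dataclass


@dataclass
class CustomFieldValue:
    name: str
    type: str
    value: str
    new_value: str = ""


def import_custom_fields(grid, imported):
    """Copy new_value onto grid rows matching by (name, type, value).

    Returns the number of imported rows that were applied; the rest are ignored.
    """
    index = {(f.name, f.type, f.value): f for f in grid}
    applied = 0
    for row in imported:
        target = index.get((row.name, row.type, row.value))
        if target is not None:
            target.new_value = row.new_value
            applied += 1
    return applied
```

Imported rows without an exact match on all three parts leave the grid unchanged.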
Report Log
All changes applied to the Translation Memories are recorded in XML log files and saved in the location indicated by the user in the settings. These reports are added to the Log Report view when you load a TM in the plugin.