Looking for 'special' numbers in the TM

In 2016, a long-standing client introduced a style guide for numbers/measures used in their manuals. Before then, they used the AmE number format, i.e. 34,000 for thirty-four thousand and 12.15 for 12 units and 15 hundredths. In 2016 they decided to follow a technical standard under which the thousands separator is a non-breaking space (i.e. 34 000) and the decimal separator remains the point (12.15). They also decided that the translated manuals should follow this rule, regardless of the local custom -- for example, in my country we use the comma as the decimal separator, but I should stick to the point (no pun intended) for this client.

After 5 years, my TM has a mix of bad sources and bad targets due to these style changes. I would like to fix my TM so that anything I pre-translate follows the new style guide.

 

1) How do I look for numbers such as 34,000 or 16,700 in the TM? I have tried [0-9],[0-9] to no avail. I'd rather avoid exporting the TM to *.tmx for Xbench, as it is quite large -- I am sure Studio can handle this. But how?

2) How do I set up a QA check (in either Studio or Xbench) to verify that the number formatting in the target matches the source? I.e. if the source reads 12.15 the target should read 12.15 as well, and not 12,15 as it did before.

 

Thanks!

  • Yes, and thank you so much for it. Xbench has powerful search capabilities but is not an editor: it can search directly in *.sdltm files (no need to convert them to *.tmx!), but any editing has to be done in Studio. Josep also made a suggestion on this forum to add this editing capability in the future. Please vote for it if you'd like to be able to edit *.sdltm files more easily.

    So, even though Xbench found all the instances that needed fixing, I could not run the same search in Studio, because the TM Editor would not accept the same RegEx strings. What I did was as follows:

    • For the thousands separator: I looked in the target for .1 / .2 / .3 / .4 / ... .0 and then ran a find and replace (find the point, replace it with a non-breaking space). So any 34.000 became 34 000.
    • For the decimal separator: I looked in the target for ,1 / ,2 / ,3 / ,4 / ... and so on, and then ran a find and replace (find the comma, replace it with a point). So any 12,15 became 12.15.
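For illustration only, the same two replacements could be sketched in Python as below. This is not something Studio's TM Editor runs; it assumes the target text is available as plain strings outside Studio, and the names (`NBSP`, `OLD_THOUSANDS`, `OLD_DECIMAL`, `normalize_target`) are made up for the example. It also assumes that a point followed by exactly three digits is an old-style thousands separator, which would misread a new-style decimal such as 12.150, so the results would still need a manual check.

```python
import re

NBSP = "\u00a0"  # non-breaking space, the new thousands separator

# Assumption: a point between a digit and a group of exactly three digits
# is an old-style thousands separator (e.g. 34.000).
OLD_THOUSANDS = re.compile(r"(?<=\d)\.(?=\d{3}(?!\d))")

# A comma between two digits is treated as an old-style decimal separator (e.g. 12,15).
OLD_DECIMAL = re.compile(r"(?<=\d),(?=\d)")


def normalize_target(text: str) -> str:
    """Rewrite old-style target numbers to the new style guide."""
    # Order matters: handle the points first, so that the decimal points
    # created in the second step are never mistaken for thousands separators.
    text = OLD_THOUSANDS.sub(NBSP, text)
    return OLD_DECIMAL.sub(".", text)
```

The order mirrors the manual steps: points are turned into non-breaking spaces first, and only then are decimal commas turned into points.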

    Then I used Xbench to see if I had missed something, and manually fixed those few occurrences in Studio TM Editor.
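As a rough sketch of the kind of check involved (not Xbench's or Studio's actual QA logic), the snippet below looks for leftover AmE-style numbers such as 34,000 and compares the numbers in a source segment with those in its target. It assumes the segments are available as plain strings, and `AME_THOUSANDS`, `NUMBER`, `numbers` and `numbers_match` are hypothetical names used only here.

```python
import re

# AmE-style grouping: one to three digits followed by one or more ",ddd" groups.
AME_THOUSANDS = re.compile(r"\b\d{1,3}(?:,\d{3})+\b")

# A number token: digit groups joined by a point, a comma or a non-breaking space.
NUMBER = re.compile(r"\d+(?:[.,\u00a0]\d+)*")


def numbers(text: str) -> list[str]:
    """Return every number token in the text, exactly as written."""
    return NUMBER.findall(text)


def numbers_match(source: str, target: str) -> bool:
    """True if source and target contain the same numbers with the same formatting."""
    # Sorted, because word order (and so number order) may differ in translation.
    return sorted(numbers(source)) == sorted(numbers(target))
```

A segment pair where `numbers_match` returns False is only a candidate for review: a target that legitimately spells a number out in words would also be flagged.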

     

    So thank you everybody, I am pretty happy with what I was able to do today. I fixed my entire TM, and when the client sends their amended sources (i.e. in 2018 their documents read 34 000), pre-translation will already propose the correct translation (34 000).