Not Considering

See the conversation for the reasoning why a CAT tool cannot address this kind of issue, which originates upstream.

Having said that, a potential option might be apps such as https://appstore.rws.com/Plugin/23 , 

Don't display bookmarks as tags in the Editor

Please add an option to prevent bookmarks from being displayed as tags in the source section of the Studio editor.

  • Loving it, Jerzy! I used to program such macros and it's so great to see this kind of code :)

  • Here you go:

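    Sub AlleTextmarkenLoeschen()
        ' "Delete all bookmarks": removes every bookmark in the active document.
        ' Iterating backwards keeps the index valid while items are deleted,
        ' and the loop is not capped at 100 bookmarks.
        Dim i As Long
        For i = ActiveDocument.Bookmarks.Count To 1 Step -1
            ActiveDocument.Bookmarks(i).Delete
        Next i
    End Sub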

    This macro removes all bookmarks in my Word (currently Office 365).

  • You can actually remove most superfluous tags with one simple operation in Word: select all (Ctrl+A), press Ctrl+D to open the Font dialog, go to the Advanced tab and set the first three fields (Scale, Spacing, Position) to 100%, Normal, Normal. This removes the tag soup.
    The tags are not always caused by authors switching between different colors; often the OCR software has recognized words and the spaces between them with different styles and scales. Assigning a single setting (100%, Normal, Normal) to the entire text removes those differences.
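
    For anyone who prefers a macro to the dialog, the same reset can be sketched in VBA. This is only a sketch under assumptions: the macro name ResetCharacterSpacing is made up for illustration, and the macro only touches the main text of the active document (not headers, footers or text boxes). Font.Scaling, Font.Spacing and Font.Position are the Word object model properties behind the Scale, Spacing and Position fields.

    ```vba
    Sub ResetCharacterSpacing()
        ' Hypothetical macro name; covers the main text story only.
        With ActiveDocument.Content.Font
            .Scaling = 100   ' Scale: 100%
            .Spacing = 0     ' Spacing: Normal (0 pt expanded/condensed)
            .Position = 0    ' Position: Normal (0 pt raised/lowered)
        End With
    End Sub
    ```

    It is probably safest to run it on a copy of the file before importing that copy into Studio.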

  • I think it is important to be able to identify the tags and understand what they do in an sdlxliff file, which requires displaying the tags in their extended form.

    Like many of you, I sometimes get dozens of style tags in a single sentence (it is the same in memoQ, by the way), which is not justified when you look at the Word document and see that the paragraph is written entirely in black. It often results from the author(s) inadvertently switching between the default color and black, which changes nothing visually in the Word document but creates many useless tags in both the Word file and the sdlxliff file. Even if you cannot see them, a Word document is full of tags, and those are the tags you then find in the Studio file.

     
    Typically, I do not reproduce color tags in the target language unless the color is something other than black (which you can check in the Word document). Alternatively, if there are too many of them, I move one opening tag <cf style=…> to the beginning of the sentence and one closing tag </cf> to the end of it, then delete the other tags of the same kind.

    You may also find, for instance, a pair of bold/italic/underline tags surrounding a blank space, which indicates that the style was not removed from the space when the author removed it from the words before or after it. In that case, you can delete the pair of tags.

    The Word document will not suffer from it; on the contrary, it will be technically cleaner.

    That said, it is key to always keep the bookmarks.
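
    The color cause described above (text switching between the default color and explicit black) could also be handled upstream in Word before the file goes into Studio. A minimal sketch, assuming the unwanted runs are formatted as pure black: the macro name BlackToAutomaticColor is hypothetical, and the macro only searches the main text of the document. It uses Word's format-only find and replace and leaves bookmarks untouched.

    ```vba
    Sub BlackToAutomaticColor()
        ' Hypothetical macro name; searches the main text story only.
        With ActiveDocument.Content.Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Text = ""                                    ' match on formatting only
            .Replacement.Text = ""
            .Font.Color = wdColorBlack                    ' explicitly black text
            .Replacement.Font.Color = wdColorAutomatic    ' back to the default color
            .Format = True
            .Forward = True
            .Wrap = wdFindStop
            .Execute Replace:=wdReplaceAll
        End With
    End Sub
    ```

    Text in any other color is not matched, so intentional color formatting stays as it is.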