Converting TM file into Excel. Set tags disappearing.

Hello, I was wondering if anyone knew a workaround to my problem.

So there are many TM files I have and I wanted to convert them into Excel. 

Obviously when I Batch task to export a project's excel file the excel file will have the Tags show up on both the Source and Target side of the excel

However, I have some old TM files where I don't have (or don't know where) the original file is so I wanted to convert that TM file into Excel using the Glossary Converter.
When I did, The tags whether they were color codings, or other variable codings (such as {0}) were removed on the exported excel.

I'm thinking that the tags were thought to be formatting tags so they were removed, but I need to be able to have them also converted and show up on Excel as if I was exporting the project file as stated first.

Does anyone know how I can go about this?

emoji
Parents
  •  

    Might be worth explaining why you want to do this?  The Glossary Converter was designed for a different purpose... the purpose being to get at the text.  So retaining the tags is normally undesirable.  If you can explain what your usecase is perhaps there is a better solution to your problem.

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

    emoji
  • For a game translation company we have many variables that show up within the text. 

    These variables are not only formatting texts such as <color#12312> </color> 
    Some of the strings will have variables such as {0} to to replaced within the game later on when put into the game.
    Such as
    "Player {0} will be kicked from the <color#12312>guild</color>, are you sure you want to continue?"  {0} = username
    "Would you like to purchase {1} {0} from the market?" {0} = item name / {1} quantity

    Sometimes there are string with many tags so when translating, games companies will usually, set these variables as tags so that it's quicker to insert it into the Target text and when running verification we are able to see if the source and target both include the variable(tags).

    That's the explanation as to why we use the tag function. As for why I'm trying to convert the TM into Excel is because original excel files (or the finished project excel files) from old projects that we have done are not available to us anymore because we cannot find them. However, we still have all the TM from those projects. So I wanted to convert the TM  into Excel so we can use the Excel to use as a data and organize them.

    I don't know if I explained it well to your understanding. 

    Please let me know if there's any other information you may need

    emoji
  •  

    ok - understood. 

    So I wanted to convert the TM  into Excel so we can use the Excel to use as a data and organize them.

    In this case your best bet is to export to TMX so the tag placeables will still be there and then bring the TMX into Excel so you retain the tags.  They won't be useful in Excel because the TM doesn't know what to do with the tags.. they are just placeables, but at least you'd know they were there.

    So if I take your example:

    Excel spreadsheet showing two rows of text with placeholder tags and color formatting tags within the sentences.

    Translate in Studio:

    Excel spreadsheet with two rows displaying source and target language content, including placeholder tags and color formatting tags, with NMT icons indicating translation status.

    Export from my TM as TMX:

    <?xml version="1.0" encoding="utf-8"?>
    <tmx version="1.4">
      <header creationtool="SDL Language Platform" creationtoolversion="8.1" o-tmf="SDL TM8 Format" datatype="xml" segtype="sentence" adminlang="en-US" srclang="en-US" creationdate="20240429T090048Z" creationid="GLOBAL\pfilkin">
        <prop type="x-Recognizers">RecognizeAll</prop>
        <prop type="x-IncludesContextContent">True</prop>
        <prop type="x-TMName">gaming</prop>
        <prop type="x-TokenizerFlags">DefaultFlags</prop>
        <prop type="x-WordCountFlags">DefaultFlags</prop>
      </header>
      <body>
        <tu creationdate="20240429T090031Z" creationid="GLOBAL\pfilkin" changedate="20240429T090055Z" changeid="GLOBAL\pfilkin" lastusagedate="20240429T090055Z">
          <prop type="x-LastUsedBy">GLOBAL\pfilkin</prop>
          <prop type="x-Context">0, 0</prop>
          <prop type="x-Origin">TM</prop>
          <prop type="x-ConfirmationLevel">Translated</prop>
          <tuv xml:lang="en-US">
            <seg>Player <ph x="1" type="0" /> will be kicked from the <bpt i="2" type="1" x="2" /><ept i="2" /></seg>
          </tuv>
          <tuv xml:lang="de-DE">
            <seg>Der Spieler <ph x="1" type="0" /> wird vom getreten <bpt i="2" type="1" x="2" /><ept i="2" /></seg>
          </tuv>
        </tu>
        <tu creationdate="20240429T090056Z" creationid="GLOBAL\pfilkin" changedate="20240429T090104Z" changeid="GLOBAL\pfilkin" lastusagedate="20240429T090104Z">
          <prop type="x-LastUsedBy">GLOBAL\pfilkin</prop>
          <prop type="x-Context">-xxxxxxxxxxxxxxxxxx, -xxxxxxxxxxxxxxxxxx</prop>
          <prop type="x-ContextContent">Player  will be kicked from the  | AgMHAwEBAQEwABkBAgIBATEAAAICAQExAA== | Der Spieler  wird vom getreten  | AgMMAwEBAQEwABMBAgIBATEAAAICAQExAA==</prop>
          <prop type="x-Origin">TM</prop>
          <prop type="x-ConfirmationLevel">Translated</prop>
          <tuv xml:lang="en-US">
            <seg>Would you like to purchase <ph x="1" type="2" /> <ph x="2" type="3" /> from the market?</seg>
          </tuv>
          <tuv xml:lang="de-DE">
            <seg>Möchten Sie auf dem Markt einkaufen <ph x="1" type="2" /> <ph x="2" type="3" /> ?</seg>
          </tuv>
        </tu>
      </body>
    </tmx>

    Create a macro in Excel to bring it in with the tags included:

    Sub ImportTMX()
        Dim fd As FileDialog
        Dim tmxPath As String
        Dim xmlDoc As Object
        Dim xmlNodeList As Object
        Dim tuNode As Object
        Dim tuvNode As Object
        Dim segNode As Object
        Dim row As Integer
    
        ' Create a file dialog object to select the TMX file
        Set fd = Application.FileDialog(msoFileDialogFilePicker)
        fd.Title = "Select a TMX File"
        fd.Filters.Add "TMX Files", "*.tmx"
        
        If fd.Show = -1 Then
            tmxPath = fd.SelectedItems(1)
        Else
            MsgBox "No file selected. Exiting macro."
            Exit Sub
        End If
        
        ' Create an XML DOM document
        Set xmlDoc = CreateObject("MSXML2.DOMDocument")
        xmlDoc.Async = False
        xmlDoc.Load tmxPath
    
        If xmlDoc.parseError.ErrorCode <> 0 Then
            MsgBox "Error loading TMX file: " & xmlDoc.parseError.Reason
            Exit Sub
        End If
        
        ' Clear existing content
        With Sheets("Sheet1")
            .Cells.ClearContents
            ' Add headers
            .Cells(1, 1).Value = "ID"
            .Cells(1, 2).Value = "Source language content"
            .Cells(1, 3).Value = "Target language content"
            ' Set column widths and alignment
            .Columns("B:C").ColumnWidth = 100
            .Columns("B:C").WrapText = True
            .Cells.VerticalAlignment = xlTop
            .Cells.HorizontalAlignment = xlLeft
        End With
        
        ' Initialize row
        row = 2
        
        ' Loop through each translation unit (tu)
        Set xmlNodeList = xmlDoc.SelectNodes("//tu")
        
        For Each tuNode In xmlNodeList
            Dim srcText As String, tgtText As String
            For Each tuvNode In tuNode.SelectNodes("tuv")
                Set segNode = tuvNode.SelectSingleNode("seg")
                If tuvNode.getAttribute("xml:lang") = "en-US" Then
                    srcText = InnerXml(segNode)
                ElseIf tuvNode.getAttribute("xml:lang") = "de-DE" Then
                    tgtText = InnerXml(segNode)
                End If
            Next
            
            ' Place values in the respective columns
            With Sheets("Sheet1")
                .Cells(row, 1).Value = row - 1
                .Cells(row, 2).Value = srcText
                .Cells(row, 3).Value = tgtText
            End With
            row = row + 1
        Next
        
        MsgBox "TMX file imported successfully."
    End Sub
    
    Function InnerXml(node As Object) As String
        ' This function strips the outer XML tags and returns the inner XML as a string
        InnerXml = node.XML
        InnerXml = Replace(InnerXml, "<" & node.BaseName & ">", "")
        InnerXml = Replace(InnerXml, "</" & node.BaseName & ">", "")
    End Function
    
    

    Run it and select the TMX:

    Excel spreadsheet with three columns labeled ID, Source language content, and Target language content, showing text with XML placeholder tags.

    Now I have the tags too and not just the text.  Is that what you wanted?

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub



    Generated Image Alt-Text
    [edited by: Trados AI at 9:22 AM (GMT 1) on 29 Apr 2024]
Reply Children
No Data