Monolingual Term Extraction - Requesting Advice

Hi,

I am a beginner looking to extract a list of monolingual (English) term candidates from a large bilingual translation memory file using Trados Studio, MultiTerm, and/or Term Extract. I am interested in extracting terms based on frequency and would greatly appreciate any guidance on how to approach this task, as I am encountering some difficulties understanding the documentation and getting stuck due to my limited experience.


Thanks and kind regards



Fixed typo in title
[edited by: FPH at 11:03 AM (GMT 0) on 19 Nov 2024]
  • I exported the TM as a TMX file, but the import options in MultiTerm are greyed out. I am using Trados Studio 2024 (18.0.1.2259) and MultiTerm 2024 (18.0.2.3266).

  •  

    I'm afraid the AI is incorrect here. I have corrected the reason for its poor response so it won't do that again, and the answer it is now likely to give to your question is this one:

    "I understand that you are looking to extract a list of monolingual term candidates from a bilingual translation memory file using Trados Studio, MultiTerm, and/or Term Extract and you're interested in doing it based on frequency. It's great that you've started using these tools, and I'm here to help guide you through the process. The tool you need for this task is MultiTerm Extract. It uses statistical extraction to determine the frequency of appearance of term candidates, which makes it perfect for your needs. From your large bilingual documents and translation memories, MultiTerm Extract will draw out term candidates and present them in a candidate list for validation. Although Trados Studio and MultiTerm themselves do not support term extraction, MultiTerm Extract is designed specifically for this job and can be easily integrated with other SDL tools. The tool enables incorporation of these terms into existing MultiTerm Extract termbases or exportation to MultiTerm XML format or tab-delimited format. It also supports checking for terminology consistency between the termbase and a translated file. The type of project you want to create for this task is a Monolingual Term Extraction project. File formats supported for a Monolingual Term Extraction project in MultiTerm Extract are: TXT – Plain text, RTF – Rich text format, DOC – Word documents, HTML, HTM, JSP, ASP, ASPX, SGML, SGM, XML, Ventura (*.txt), PageMaker (*.txt), QuarkXPress (*.qsc, *.xtg, *.ttg, *.tag), TTX – TRADOStag file format, InDesign (*.isc), PowerPoint (*.ppt, *.pps, *.pot), Excel (*.xls, *.xlt). Remember that before you begin, you need to obtain a separate license to fully use SDL MultiTerm Extract. I hope this information is helpful. It's wonderful that you're keen on making the most of SDL's powerful linguistic tools, and I'm confident that with a bit of practice, their operation will become second nature. I wish you the best with your term extraction project, and should you have any more questions, don't hesitate to ask!"

    The important takeaway from this is that, according to the help, TMX is not a supported format for monolingual term extraction. So I would suggest you start by converting your TMX into something like Excel (easy with the Glossary Converter) and then base your project on the English column of that Excel file only.
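
Since plain text (TXT) is also among the supported formats listed above, another possible route is to pull the English segments out of the TMX with a short script. What follows is only a minimal Python sketch, not an RWS tool. It assumes each `<tuv>` carries its language in `xml:lang` (or the older `lang` attribute), that the English language code starts with `en`, and that text inside inline tags can be ignored. The function names and file names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Fully qualified name that ElementTree uses for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def segment_text(seg):
    """Return the plain text of a <seg>, skipping inline tag content.

    Simplification (assumption): inline elements such as <bpt>, <ept>
    and <ph> are treated as formatting codes, so only the text before
    and after them is kept.
    """
    parts = [seg.text or ""]
    for child in seg:
        parts.append(child.tail or "")
    return "".join(parts).strip()


def english_segments(tmx_source, lang_prefix="en"):
    """Collect the segments whose language code starts with lang_prefix.

    tmx_source may be a file path or a binary file object.
    """
    segments = []
    for _, elem in ET.iterparse(tmx_source, events=("end",)):
        if elem.tag != "tu":
            continue
        for tuv in elem.iter("tuv"):
            lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
            if not lang.lower().startswith(lang_prefix):
                continue
            seg = tuv.find("seg")
            if seg is not None:
                text = segment_text(seg)
                if text:
                    segments.append(text)
        elem.clear()  # release the translation unit once it is processed
    return segments


def write_segments(segments, txt_path):
    """Write one segment per line to a UTF-8 text file."""
    with open(txt_path, "w", encoding="utf-8") as out:
        for segment in segments:
            out.write(segment + "\n")


# Example usage (hypothetical file names):
# write_segments(english_segments("export.tmx"), "english_only.txt")
```

The resulting text file could then be tried as the source document for a Monolingual Term Extraction project. Whether MultiTerm Extract handles a very large TXT file well is something to test on a small sample first.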

    Hopefully that will work better?

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Hi Paul,

    Thank you for the feedback. 

    I tried what you suggested. As it stands, I open the bilingual TM in Trados Studio and export it as a TMX (Translation Memory eXchange) file. I then open that TMX in Glossary Converter and try to export it as an Excel file.

    However, selecting any of the Excel output options (2004, 2007 Workbook, TXT, or CSV) results in a '.gcsettingsxml' file being exported instead.

    Best regards
