How to achieve document-level or project-level context for MT and AI

I am looking for a way to achieve at least document-level context for MT and AI. My current workaround is to run an AI platform in parallel and copy/paste from Trados Studio to the AI platform. There, an agent is equipped with the entire document (and reference material, if there is any). I then manually paste the response back into Studio. So far, so good, but this is painfully manual and very slow.

I'd like to be able to tell Studio's AI assistant:

“Include the preceding X segments (or paragraphs) in your request. With (or without) translation.”

“Include the succeeding X segments (or paragraphs) in your request. (With or without translation, for the sake of completeness, although this will usually be without unless I am in the role of the reviewer.)”
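To make the request concrete, here is a minimal sketch of what such a context window could look like when the prompt is assembled. Everything here (the Segment class, the marker format, the window sizes, the prompt wording) is my own illustration, not an existing Studio or AI Assistant API:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Segment:
    source: str
    target: Optional[str] = None  # None / empty if not yet translated

def build_prompt(segments: List[Segment], index: int,
                 before: int = 3, after: int = 3,
                 include_translations: bool = True) -> str:
    """Assemble a translation request for one segment that includes
    the preceding and succeeding segments as context."""
    lines = ["Translate the segment marked >>> using the surrounding context."]
    lo = max(0, index - before)
    hi = min(len(segments), index + after + 1)
    for i in range(lo, hi):
        seg = segments[i]
        marker = ">>> " if i == index else "    "
        line = marker + seg.source
        # Show existing translations of the context segments, if requested
        if include_translations and i != index and seg.target:
            line += " => " + seg.target
        lines.append(line)
    return "\n".join(lines)
```

The `before`/`after` parameters correspond to the "preceding/succeeding X segments" knobs described above, and `include_translations` to the "with or without translation" switch.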

I created an idea for this, please support:  Context-awareness for AI Assistant

For terminology, this is very, very important: Include such-and-such a field in your request. This is how I could tell the AI not to use TB entries with the status “deprecated” or “superseded”. There is an idea for this already, please support:  OpenAI Provider for Trados Studio: option to include term information in system prompt

A lot happened recently with the AI Assistant (user can modify the system prompt)! Thank you for that!



Removed AI Suggestion
[edited by: Daniel Hug at 10:17 AM (GMT 0) on 13 Dec 2025]
  • For MT, I remember Globalese (since acquired by memoQ) offered only “asynchronous” MT – you had to send the whole document to their MT engine, wait a while, and get the whole document back (all target segments filled in). I was thinking along these lines when thinking about context-aware MT. It could store the segments in a TM.

    I can do this already, manually: get the whole document translated by MT (or AI), perform an alignment (this works really well with AI), create a TM, and populate it with the TUs resulting from the alignment. It's entirely possible already, just very slow. I could write a script to speed up parts of it, but I think it would be timely for any translation tool to offer this as out-of-the-box functionality. It could be done as part of the core product or as an app.
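The TM-population step of that workflow is easy to script. Below is a hedged sketch that writes already-aligned source/target pairs to a minimal TMX 1.4 file; the language codes, header attributes, and tool name are illustrative, and the alignment itself (e.g. done with AI) is assumed to have happened beforehand:

```python
import xml.etree.ElementTree as ET

def pairs_to_tmx(pairs, src_lang="en", tgt_lang="de"):
    """Build a minimal TMX 1.4 document from (source, target) pairs."""
    tmx = ET.Element("tmx", {"version": "1.4"})
    ET.SubElement(tmx, "header", {
        "srclang": src_lang, "segtype": "sentence",
        "datatype": "plaintext", "adminlang": "en",
        "creationtool": "align-to-tm-sketch",  # invented tool name
        "creationtoolversion": "0.1",
        "o-tmf": "plain",
    })
    body = ET.SubElement(tmx, "body")
    for src, tgt in pairs:
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {"xml:lang": lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")
```

The resulting file can then be imported into a Studio TM in the usual way.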

  • Just to say that I expressed the very same idea during a talk in Luxembourg in October, recommending it as the way forward. But I have no further ideas on its implementation (other than the slow way that you have outlined).

  • Hi,
    Unfortunately, I can't help you with the AI Assistant, and passing document and termbase context to the prompt can be difficult in general.
    However, if you want to export the entire text, along with the terms from the termbase that occur in the document, and then import the AI translation, you can check out the TransAIde plugin. It does just that.

    https://posteditacat.xyz/en/

    https://www.youtube.com/watch?v=VbW-YH-yaw4&t=6s

    https://appstore.rws.com/Plugin/414


    Dariusz Adamczak (posteditacat.xyz)

  • Thank you  ,

    Your solution makes a lot of sense, but (cc: ) I notice there are a lot of attempts at the moment to export content from Trados Studio in order to translate it in context using AI or MT systems and then re-import it into Trados. Michael Beijer's “Supervertaler” is another variation on the theme. I have been tinkering with the XLIFF export function for the same purpose for almost a year now.

    I think the message is clear: while there is still a lot of utility in segmentation, the time for context has arrived. This is functionality that CAT systems should provide natively – and will, I am sure. The market will dictate it; the competitive advantage of being able to do so is overwhelming.

    I am currently using Trados to translate with MT (more reliable than AI), exporting to XLIFF, handing the whole file over to an AI agent, and re-importing the translations. Then I work in Trados for the QA steps or to send projects off to co-workers. So while Trados is still the hub of my translation tech stack (file conversions / file types!), the actual translation happens more and more outside of it. I wish it could move back in.

  • I agree with you on many points.

    Document-level context vs. segment-level context:
    Once someone has tried using AI for translation with document-level context, they will never go back to segment translation. And the CAT industry must take this into account because the first solutions will reap the biggest rewards. However, this is not a trivial problem. Personally, I also would prefer to use Trados only for importing source files, as a QA tool, and for generating target documents. (https://posteditacat.xyz/beyond-segments-the-critical-role-of-context-in-modern-translation/).
    I think there are many translators who work this way, or at least copy text from AI into their CAT tools, because they don't know that tools for this purpose already exist. I only just found out about Supervertaler, for example.

    XLIFF process:
    My attempts to translate large XLIFF files with AI have not been very successful. XLIFF files are simply too large to be processed efficiently by AI. We could process them segment by segment, but then we would lose the context of the entire document. Do you have any ideas for this?
    For projects with a lot of exact and fuzzy matches, I developed a compact JSON format containing only the necessary data (source, existing target, segment identifier, and status), and it works surprisingly well when updating large projects.
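I don't know the exact fields of that format, but based on the description (source, existing target, segment identifier, status), a compact per-segment record might look something like this; all field names and values below are invented for illustration:

```python
import json

# Hypothetical compact records: only what the AI needs to update a project.
segments = [
    {"id": "s12", "src": "Press the start button.",
     "tgt": "Drücken Sie die Starttaste.", "status": "exact"},
    {"id": "s13", "src": "Close the lid.",
     "tgt": "Schließen Sie den Deckel.", "status": "fuzzy"},
    {"id": "s14", "src": "Wait ten seconds.",
     "tgt": "", "status": "untranslated"},
]

# ensure_ascii=False keeps non-ASCII target text readable for the model
payload = json.dumps(segments, ensure_ascii=False, indent=2)
```

Stripping a file down to records like these keeps the payload far smaller than a full XLIFF while preserving the document order, so the model still sees the whole document as context.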

  • Let me describe my own experiments (with small files - medical guidelines - of only about 300 words).
    Step 1: I upload my source text and a glossary to NotebookLM and ask for a translation that respects the glossary. This yields a text-based translation.
    Step 2: I turn source and target into a TMX file using LF Aligner.
    Step 3: I start a project in Studio using the TMX and run through the text, revising where needed.

    My latest experiment shows a few worthwhile improvements with this method compared with an earlier translation of the same text in Studio (i.e. in segmentation mode), with ChatGPT as the engine. Two examples of improvements:
    (1) In the earlier translation, the title was rendered in a shortened version, which would be acceptable in the body of the text but not as a title. The text-based NotebookLM translation didn't make this error.
    (2) In the earlier translation, a feminine French term was referred to with the masculine pronoun "il" in a subsequent sentence; the NotebookLM version correctly chose "elle".

  • Hi guys,

    Supervertaler has been undergoing extremely rapid development (thanks to Claude Code). I am even starting to get .sdlppx/.sdlrpx support working reliably!

    See: https://supervertaler.com/changelog


    Indeed, Supervertaler will soon even be signed and notarized on macOS, and I already have ready-to-run Windows EXEs and macOS DMGs in the latest releases on GitHub!

    https://github.com/michaelbeijer/Supervertaler/releases/tag/v1.9.276

    In the beginning, I spent most of my time exporting bilingual files from memoQ or Trados and pre-translating them in Supervertaler with AI (while creating the prompt on chatgpt.com), but the CAT tool in Supervertaler has now gotten so good that I prefer to do the actual human translation (post-tweaking, or whatever you call it these days) in Supervertaler as well. Claude Code basically allows me to implement new ideas in mere hours that would take a proper team weeks!

    For example, I always loved the novel terminology display system I first encountered in the RYS plugin for Trados (then called "RyS Termbase & Translation Assembler"; now called "RYSTUDIO Post-editing Package"), so I implemented a version of it in Supervertaler. The whole process took just a day!

    See: community.rws.com/.../the-sad-sad-state-of-trados-studio-s-useless-terminology-tools

    Michael

  • “XLIFF files are simply too large to be processed efficiently by AI”

    I'm not sure I understand this. Do you send the full XLIFF file? Have you tried stripping the <internal-file> content first? It can be huge if the original file contained images.
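For anyone who wants to try this, here is a rough sketch of stripping the embedded skeletons from an XLIFF 1.2 file before sending it to an AI. It assumes the standard XLIFF 1.2 namespace and is not tied to any particular tool's export:

```python
import xml.etree.ElementTree as ET

# Default namespace of XLIFF 1.2 documents
NS = "urn:oasis:names:tc:xliff:document:1.2"

def strip_internal_files(xliff_text: str) -> str:
    """Remove every <internal-file> element (often base64-encoded
    originals, including images) from an XLIFF 1.2 document."""
    root = ET.fromstring(xliff_text)
    tag = f"{{{NS}}}internal-file"
    # Snapshot the tree first so removal doesn't disturb iteration
    for parent in list(root.iter()):
        for child in [c for c in parent if c.tag == tag]:
            parent.remove(child)
    return ET.tostring(root, encoding="unicode")
```

The translatable <trans-unit> content is untouched; only the embedded skeleton payload is dropped, which is usually where most of the file size lives.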

  • Hi guys! Supervertaler now has an "Okapi sidecar" — a lightweight Java microservice that runs quietly in the background and handles monolingual file imports and exports using the industry-standard Okapi Framework file filters.

    See: https://github.com/michaelbeijer/Supervertaler/releases/tag/v1.9.342 + https://supervertaler.com/changelog (see: v1.9.342)

    What is the Okapi Framework?

    The Okapi Framework is the same open-source localisation toolkit used under the hood by various professional translation tools. It contains thoroughly battle-tested file filters for dozens of formats — DOCX, XLSX, PPTX, HTML, XML, IDML, and many more — with proper handling of inline formatting, segmentation, and round-trip fidelity.

    What does "sidecar" mean?

    Since Okapi is written in Java and Supervertaler is a Python/Qt application, they can't talk to each other directly. The sidecar is a small Java process that starts automatically in the background when needed. Supervertaler communicates with it over a local REST API — sending files to be extracted into translatable segments, and sending translations back to be merged into a properly formatted output file. You never have to interact with it; it just works behind the scenes.
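As an illustration of the sidecar pattern described above (not Supervertaler's actual API; the port and endpoint name below are invented), the client side of such a local REST call might be built like this:

```python
import json
import urllib.request

# Assumed local-only address for the Java helper process (hypothetical)
SIDECAR = "http://127.0.0.1:8753"

def extract_request(path: str) -> urllib.request.Request:
    """Build the POST that asks the sidecar to extract translatable
    segments from a file. The /extract endpoint is illustrative."""
    body = json.dumps({"file": path}).encode("utf-8")
    return urllib.request.Request(
        f"{SIDECAR}/extract",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# In real use the app would send this with urllib.request.urlopen(...)
# and parse the JSON list of segments from the response.
```

Because everything stays on 127.0.0.1, no document content leaves the machine in this step; the HTTP hop only bridges the Python/Java process boundary.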

    What does this mean in practice?

    The previous system used a fully Python-based DOCX importer, which worked reasonably well but struggled with more complex formatting. The Okapi-powered system produces exported files that are exact replicas of the original in terms of formatting and layout — bold, italic, colored text, heading styles, fonts, lists — everything comes through faithfully. It also brings proper SRX segmentation, better paragraph detection, and semantic inline formatting tags (like <b> for bold) that are visible while you translate.

    The new system can already be tested in the latest builds available via pip. I'm also working on a Windows EXE release and a Mac DMG.

  • Just wanted to add to this:

    The Okapi Framework is the same open-source localisation toolkit used under the hood by various professional translation tools. It contains thoroughly battle-tested file filters for dozens of formats — DOCX, XLSX, PPTX, HTML, XML, IDML, and many more — with proper handling of inline formatting, segmentation, and round-trip fidelity.

    Okapi is indeed open-source, well-established, and provides file filters for a wide range of formats.  It's been around since the early 2000s and may well be used as plumbing inside some translation tools.  I think that in the early days of Phrase (Memsource at the time) they might have based their solution on Okapi, but I'd be surprised if that is still the case today.  Whilst the range of supported formats is impressive, and the framework does handle segmentation (via SRX) and inline codes reasonably well, you might be stretching its capabilities a little :-)

    • "Battle-tested" is doing some heavy lifting.  The filters vary in quality.  Some (like HTML, XLIFF, PO) are likely fairly solid.  Others (like IDML) have known quirks and limitations.  The DOCX filter, for instance, handles the basics well but can stumble on more complex documents with nested content controls, tracked changes, or unusual formatting.
    • "Round-trip fidelity" is aspirational rather than universal.  For simpler files it's probably fine, but edge cases in formats like IDML or heavily styled DOCX can and do break.  Anyone working seriously with localisation file filters knows that perfect round-tripping is the hardest part, and Okapi doesn't magically solve that.
    • "Used under the hood by various professional translation tools" - maybe, but not as widespread as the statement implies.  Many major tools (Trados, memoQ, Across) use their own proprietary filters rather than Okapi's.

    Also, I got your message to my alter-ego, but this is probably better directed here.  Everything you need to get started with the APIs can be found in the RWS resources, so a good place to start is here:

    https://developers.rws.com/

    You'll find the SDKs and API documentation for almost every product we have.  The Trados portfolio is certainly well covered.  For Trados Studio plugins specifically, a good starting point is here:

    https://developers.rws.com/studio-api-docs/articles/gettingstarted/studio_plugin_overview.html

    And for technical questions this forum is a must:

     

     

    And lastly, the open-source plugins are a great learning resource... much of the work the Trados AppStore team has done is open source, and they share it here:

    https://github.com/RWS/Sdl-Community

    Hope that helps?

    Paul Filkin | RWS

    Design your own training!
    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub
