Announcing change of PDF conversion technology in Trados Cloud Offerings (Trados Studio's Cloud Capabilities, Trados Team, Trados Accelerate/Enterprise)

From Monday 13th March 2023 onwards, the Trados cloud platform (Studio cloud capabilities, Trados Team, Trados Accelerate/Enterprise) will start using a new mechanism and underlying technology to convert PDF files to translatable format in translation projects. Trados Studio also uses the new mechanism and underlying technology for processing PDF files in translation projects starting from cumulative update 6.

Our stance and recommendation is still to use the original file format wherever possible. PDF should always be seen as a workaround for situations where the original file cannot be made available. PDF does not lend itself readily to localization and creates a lot of extra effort during the process.

This new technology provides PDF support similar to that of the previous vendor and, overall, you should get similar results when working with PDF files in the cloud or Trados Studio. However, the new PDF file type converts PDF project files into translatable format slightly differently, which can lead to the following differences when compared to previous Trados Studio versions:

  • Differences in analysis statistics.
  • Differences in TU lookup results due to changes in how PDF text formatting and segmentation are now handled.
  • Differences in how images are recovered when generating the translated PDF files.
  • Differences in how special characters and symbols are processed. If you use Asian languages or other non-Latin based languages as source languages, we recommend that you enable the new Use alternative processing (better for non-Latin based languages) option in the PDF file type settings.
  • Differences in PerfectMatch. If you ran PerfectMatch on a PDF file converted with the previous file type and now run PerfectMatch on a new bilingual file using the new file type, not all PerfectMatch segments may be transferred to the new file. To work around this, you can use the setting "Ignore formatting" in the PerfectMatch settings.

Conversion settings for new and existing projects

Your existing PDF file-type configuration will continue to work. However, the Conversion settings for new PDF-based projects will change as follows:

  • Layout - remains available; your existing setting will be remembered.
  • Headers and footers - no longer available; headers and footers will now always be extracted.
  • Detect tables - no longer available; tables will now always be extracted.
  • Image recovery - no longer available; images are kept, but no text is extracted.
  • Recognize PDF text - no longer available; images are kept, but no text is extracted.
  • New setting: Use alternative processing (better for non-Latin based languages).

Support for scanned PDF documents

Support for scanned PDF documents using OCR (optical character recognition) is limited out of the box.

If a PDF file contains merely a scanned picture of the underlying document, then the new technology will not be able to convert the document. If, on the other hand, the document is scanned but the text in it is selectable, then the technology will attempt to convert the characters within the document.

You can test this in Adobe Reader, for example. If it's possible to select any text in the document, then the technology should be able to attempt to convert it.
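If you would rather check a batch of files without opening each one, the same idea can be approximated in a script. The following is only a rough sketch, not how Trados or the new conversion technology detects text: it assumes the PDF is unencrypted and that its page content is either uncompressed or Flate-compressed, and it simply looks for text-drawing operators (`Tj`/`TJ` inside a `BT ... ET` block). The function names `inflate` and `has_text_layer` are made up for this illustration.

```python
import re
import zlib

# Matches the body of every PDF stream object (page content, images, ...).
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# Matches a text object (BT ... ET) that contains a text-showing operator.
TEXT_RE = re.compile(rb"BT\s.*?(?:Tj|TJ)\b.*?\bET\b", re.DOTALL)


def inflate(data: bytes) -> bytes:
    """Return Flate-decompressed data, or b"" if data is not Flate-encoded."""
    try:
        # decompressobj tolerates trailing bytes after the compressed data.
        return zlib.decompressobj().decompress(data)
    except zlib.error:
        return b""


def has_text_layer(pdf_bytes: bytes) -> bool:
    """Heuristic (assumption-laden): True if any stream appears to draw text."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        if TEXT_RE.search(raw) or TEXT_RE.search(inflate(raw)):
            return True
    return False
```

For a file on disk you could call `has_text_layer(pathlib.Path("sample.pdf").read_bytes())`, where `sample.pdf` is a placeholder name. A `False` result suggests a pure image scan that will probably need OCR first. Treat the result as a hint only: encrypted files, other compression filters, or text drawn as vector outlines can all mislead this check, so selecting text in a PDF viewer remains the more reliable test.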

The IRIS app is no longer supported as a complement for the PDF file type. We recommend that you use the new PDF Assistant for Trados Studio app for optimized support of scanned PDF documents regardless of source language (see below for details). 

Alternative approaches

If you need more advanced support for scanned PDF documents, we recommend the following options:

  • Install the new PDF Assistant for Trados Studio app. This is a new, free app that we have developed especially for this change. It uses a new and sophisticated approach to PDF conversion and is available from within Trados Studio > Add-Ins tab > RWS AppStore, and from the RWS AppStore website.
    In this first release, PDF Assistant for Trados Studio uses Microsoft Word behind the scenes to perform PDF to DOCX conversion. It uses Word's rich capabilities to handle scanned documents and documents in a variety of languages, including bidirectional and Asian.
  • Use Microsoft Word’s built-in PDF support. Word can open PDF files, including OCRed ones, and save them in Word .DOCX format, which you can then process as usual.
  • Use Adobe Reader's built-in function to save PDF documents in Microsoft Word format. This option can be purchased as a subscription.
  • Check out third-party solutions, such as ABBYY FineReader or Readiris. These can convert OCRed PDF documents to Microsoft Word format. These solutions are available as perpetual licenses or on subscription.

What’s next?

We will keep updating and refining the new PDF Assistant for Trados Studio app to give you the best possible PDF conversion capabilities. We have developed this app with extensibility in mind, so in future updates, we may integrate other conversion providers into the app. For more information, see the PDF Assistant for Trados Wiki.

Besides updating the app, we are committed to continuing to improve PDF support with future updates and are in constant touch with our new vendor about this. While we are transitioning to the new technology, we are keen to get your feedback on this change via the Trados Studio user community.

Trados Product Management

  • Apparently this exception ("Exception: Retrieving the COM class factory for component with CLSID {000209FF-0000-0000-C000-000000000046} failed due to the following error: 80080005 Server execution failed (Exception from HRESULT: 0x80080005 (CO_E_SERVER_EXEC_FAILURE)).") has to do with a Microsoft Office 365 update. I managed to use the PDF Assistant for Trados after I ran "repair" for the Microsoft Office suite.

    So, Paul, the PDF Assistant is an option and it works, at least for me it did.

    Thank you for your support

  • I have to say that I am also very disappointed with this PDF change policy because the Solid plugIn worked much better for us, at least it was able to process scanned documents.

    I tried the PDF Assistant PlugIn today with a small document and with a large document of scanned PDFs and the results of PDF Assistant were the same as using MS Word: approx. 70% of text processed and 30% of chunks of images with text... (a DTP nightmare).

    I was still able to try these two documents in a Trados Studio licence that had not been updated to this last version, and the result of Solid was much better than PDF Assistant or MS Word. I was actually glad that my co-worker ignored my request some weeks ago to update all licenses. We do use ABBYY for some projects that require hard DTP/Graphic Design work, but for an everyday routine in which we receive translation requests on paper (certified translations) we suddenly lost the ability to give a quick word count/quotation to our clients with this new update of Trados Studio.

  • Hello, Paul!

    "The old method converted to a DOCX anyway and you'd have to tidy up the target file to make sure it was presentable.  So converting to a DOCX, tidying that up and translating the DOCX is probably a more sensible workflow anyway since you won't have to deal with poor conversions in the translation." - yes, there is a more sensible workflow, but at least you could convert the file from a non-editable format to an editable format, and this helps a lot when dealing with files that are not editable.

    However, I have just installed PDF Assistant for Trados and unfortunately I cannot make it work. I have tried with several pdf files and I keep on receiving the same message:

    "Exception: Retrieving the COM class factory for component with CLSID {000209FF-0000-0000-C000-000000000046} failed due to the following error: 80080005 Server execution failed (Exception from HRESULT: 0x80080005 (CO_E_SERVER_EXEC_FAILURE))."

    I have the latest version of Trados Studio 2022 (Trados Studio 2022 - 17.0.6.14902). 

    Is there any solution to make it work? Or does it have some issues at the moment? Thank you!

  • Thanks - I have now updated the announcement above to include more comprehensive information and more details about the PDF Assistant, which was not available at the time when we first announced the change.

  •  

    Perhaps this plugin is what you're after?

     PDF Assistant for Trados 

    The old method converted to a DOCX anyway and you'd have to tidy up the target file to make sure it was presentable.  So converting to a DOCX, tidying that up and translating the DOCX is probably a more sensible workflow anyway since you won't have to deal with poor conversions in the translation.