PDF Issue: Layout of Translated File & Recognition of Source Text

Former Member

Hi all,

I am trying to translate an original (editable) PDF in Trados from English into Greek, and I have run into two major issues. First, many characters in the source text are not recognized properly when the file is imported into Trados. As a result, the source text in my Editor appears as gibberish in many segments. For example, the character x is replaced with - while w is replaced with x.

[Screenshot: Trados Studio Editor showing text recognition errors, with characters like 'x' replaced with '-' and 'w' replaced with 'x'.]

An example of bad text recognition

Second, when I generate the translation, the layout of the translated file is completely ruined. For example, much of the text from the 1st page has moved to the 2nd page, the segmentation is broken, and the spacing is terrible (see images below).

[Screenshot: the original PDF layout with proper text alignment and spacing on the first page.]

The layout of the source text

[Screenshot: the translated PDF layout on the first page with text misalignment and spacing issues.]

The layout of the target file (1st page)

[Screenshot: the translated PDF layout on the second page showing text overflow from the first page and disrupted segmentation.]

The layout of the target file (2nd page)

In general, I always run into layout issues when translating PDF files. What would you suggest to prevent these issues, if that is possible?

Kind regards,

Christos



Generated Image Alt-Text
[edited by: Trados AI at 8:39 PM (GMT 0) on 28 Feb 2024]
  • I must disappoint you: there is no way to get a better result unless you invest some work, and you have to invest that work upfront. Even though it is possible to translate a PDF directly with a CAT tool, it is a very bad idea. You have already learned why: the conversion is what it is, and you have no influence over what gets converted or how.

    Either insist on translating the native format the document was in before it became a PDF, or use a decent OCR tool and convert the PDF manually. Then make sure that the fonts used cover your target language. Expecting any automated tool to deliver perfect conversion quality is, forgive my French, naive at best.

    If you want to learn more about Studio and PDF, watch this upcoming webinar: http://seminare.bdue.de/4705

    BTW, the term "editable" PDF is very misleading. No PDF is "editable", as the format was developed entirely for READ-ONLY use. It is not intended to be edited in any way. What you mean is a "clickable" PDF, where you can click and select text. If the PDF is not protected, you can also copy the text. If the customer will not deliver the original source file, the best idea would be simply to copy all the text into a notepad to remove all formatting, translate it, apply basic formatting such as headings and list elements, and deliver that to the customer so that their layouter can copy & paste it. Or reformat it yourself.
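
The "copy everything into a notepad" step can also be scripted. What follows is only a minimal sketch, assuming the text has already been copied out of the PDF into a plain string. The function name `clean_copied_text` is hypothetical and has nothing to do with Trados Studio itself. It joins hard-wrapped lines and collapses stray spacing, but it does not repair words hyphenated across line breaks or characters that were extracted incorrectly.

```python
import re


def clean_copied_text(raw: str) -> str:
    """Turn text copied from a PDF into plain paragraphs (hypothetical helper)."""
    # Normalize Windows and old Mac line endings to "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")

    # Treat one or more blank lines as a paragraph boundary.
    paragraphs = re.split(r"\n\s*\n", text)

    cleaned = []
    for para in paragraphs:
        # Join hard-wrapped lines and collapse runs of whitespace.
        joined = " ".join(para.split())
        if joined:
            cleaned.append(joined)

    # One blank line between paragraphs, no other formatting.
    return "\n\n".join(cleaned)
```

The result is plain text with one paragraph per block, which can then be translated and given basic formatting again by hand.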


  • Fun fact: Looking closer at the PDF screenshot I realized (only after posting my answer) that the "customer" is actually Christos himself ;-)
