HTML web page translation into Arabic - How to deal with myriads of tags

Hello

When I add these web pages (sample attached), a tag separates every single letter.

1- How do I deal with this type of HTML?

2- How do I flip the page horizontally to match the RTL BiDi Arabic format?

Is there any tool to flip HTML pages to RTL?

I searched for a solution, but in vain.

https://www.w3.org/International/questions/qa-html-dir

https://www.w3.org/International/questions/qa-translate-flag

http://www.ab-weblog.com/en/internationalization-how-to-localize-html5-projects/

spine-rtl-Arabic.zip

Screenshot of Trados Studio showing HTML code with a 'span' tag separating each letter, causing a formatting issue.

Thanks



Generated Image Alt-Text
[edited by: Trados AI at 12:34 PM (GMT 0) on 29 Feb 2024]
  • Hello,

    Translating HTML pages into Arabic and dealing with numerous tags can indeed be a challenge, but Trados Studio is equipped to handle this. Here's how you can approach this:

    Step 1: Import the HTML file into Trados Studio. Trados Studio is designed to recognize and handle HTML tags, so you don't need to worry about them. The software will protect the tags during translation to ensure the functionality of the web page remains intact.

    Step 2: Translate the text as you normally would. The tags should not interfere with your translation process. If you find that a tag is separating every single letter, you may need to adjust your tag settings. You can do this by going to the "Options" menu, then "File Types", and adjusting the settings under "HTML".

    Step 3: To flip the HTML pages to match the RTL (right-to-left) BiDi Arabic format, you can use the dir attribute in your HTML. This attribute specifies the text direction of the content. For Arabic, you would use dir="rtl". Add it to the html tag at the start of your document, like so: <html dir="rtl">. This applies the RTL direction to the entire HTML document.

    Please note that Trados Studio does not have a specific tool to flip HTML pages RTL, but the "dir" attribute in your HTML should help achieve the desired result.
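
As a rough illustration of the dir attribute step above, here is a minimal Python sketch that sets dir="rtl" on the opening html tag of a page. The function name set_rtl is hypothetical, adding lang="ar" alongside dir is an assumption on my part, and the regular expression only targets the html tag itself, so treat it as a starting point rather than a complete tool.

```python
import re


def set_rtl(html: str, lang: str = "ar") -> str:
    """Return html with lang and dir="rtl" set on the opening <html> tag.

    Hypothetical helper: the lang default of "ar" is an assumption.
    """

    def patch(match: re.Match) -> str:
        attrs = match.group(1)
        # Drop any existing dir/lang attributes so they are not duplicated.
        attrs = re.sub(
            r'\s+(?:dir|lang)\s*=\s*(?:"[^"]*"|\'[^\']*\'|[^\s>]+)',
            "",
            attrs,
            flags=re.I,
        )
        return f'<html{attrs} lang="{lang}" dir="rtl">'

    # count=1: patch only the first <html ...> opening tag.
    return re.sub(r"<html\b([^>]*)>", patch, html, count=1, flags=re.I)


if __name__ == "__main__":
    print(set_rtl('<html lang="en"><body>Hello</body></html>'))
```

A page with no html tag is returned unchanged. Elements that hard-code their own direction or alignment in CSS would still need to be checked by hand.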

    I hope this helps! If you have any other questions, feel free to ask.

    Best regards,

    RWS Community AI

  •  

    These might be webpages but they were clearly PDFs. I reckon the reason for all the tags may well be the conversion from PDF, because these tags are pointless. If you look at the HTML itself, it's full of random span tags that were put there by the conversion process.

    You will really struggle to manage this if you don't remove the tags first.  If this is a job your client has given to you then you have two sensible options I think depending on your skillsets:

    1. Get the original PDF and convert that to Word. Clean up the tags, translate and return a Word file.
    2. Programmatically remove the tags using something like Python with "Beautiful Soup", JavaScript with the browser's DOM, or even regular expressions... although I think this would be quite challenging looking at your files.

    Certainly I don't think attempting to translate this in the state it is in would be a sensible approach.
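
To give an idea of what the second option above could look like, here is a minimal Python sketch that drops every span start and end tag while keeping the text and all other markup. It uses the standard library's html.parser rather than Beautiful Soup, purely so it runs without installing anything. The names SpanStripper and strip_spans are hypothetical, and removing every span is an assumption: if some spans in the real files carry styling that matters, they would need to be filtered by attribute instead.

```python
from html.parser import HTMLParser


class SpanStripper(HTMLParser):
    """Re-emit HTML as written, except that <span> tags are dropped."""

    def __init__(self) -> None:
        # convert_charrefs=False keeps entities such as &amp; as written.
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through verbatim.
        if tag != "span":
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag != "span":
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def strip_spans(html: str) -> str:
    """Return html with all <span> wrappers removed (hypothetical helper)."""
    parser = SpanStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    sample = '<p><span class="a">H</span><span class="b">e</span>llo</p>'
    print(strip_spans(sample))
```

Once the per-letter spans are gone, the letters join back into whole words, which should leave far fewer tags inside each segment. Always run something like this on a copy of the files and compare the result in a browser before translating.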

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Thank you for these clarifications.

    Do you have any idea how to flip the HTML to RTL, just flip it horizontally, without translation?

  •  

    Do you have any idea how to flip the HTML to RTL, just flip it horizontally, without translation?

    Yes. Just have an RTL language for the target, copy source to target and save the target. For example:

    Screenshot showing English source copied to target in an Arabic target so it's all right aligned.

    This is all right aligned. Save the target:

    Screenshot of the resultant HTML, also right aligned.

    Also right aligned.

    I did nothing more than create an English to Arabic project, copy source to target and save the target.

    Paul Filkin | RWS Group
