Question about character entities management in Excel files (SDL Trados)

Hello,
I am a freelance translator and I am trying out SDL Trados Studio 2015 to see whether it can help me with website translations.
Usually, my clients send me Excel files with HTML code embedded. I learnt how to filter the embedded code, but I don't know how to manage the character entities (for example &aacute; for the character á, or &agrave; for the character à).
If I work with an HTML file, SDL Trados supports entity conversion (Latin-1 in our case), so if I type the character á, SDL Trados actually writes &aacute; to the HTML file, as any HTML editor would. However, if I work with an Excel file, I can only filter tags such as <p></p>, but I cannot convert entities.
Is there any solution to that? Any configuration I can do?
Please let me know.
Thank you in advance.
Javier
  • Hi Javier,

    You won't be able to convert the entities, but you can protect them as placeholder tags using the embedded content processor. Try this as a catch-all... I think it will get most of them:

    &\w+;

    If you want to go the other way, then this won't be possible at all... you'd have to type the entities yourself. I guess you could do that with quick inserts, or perhaps with autocorrect?
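To see what the catch-all pattern does and does not pick up, here is a minimal sketch using Python's `re` module. It only illustrates the pattern itself; Studio applies its own regular expression engine, so the behaviour there should be checked separately. The broader pattern `WITH_NUMERIC` and the sample string are assumptions added for illustration, not something from this thread.

```python
import re

# The catch-all from the reply above: ampersand, word characters, semicolon.
NAMED_ONLY = re.compile(r"&\w+;")

# Hypothetical broader variant (an assumption, not from the thread) that also
# covers decimal (&#233;) and hexadecimal (&#xE9;) character references.
WITH_NUMERIC = re.compile(r"&(?:\w+|#\d+|#[xX][0-9A-Fa-f]+);")

# Made-up sample cell content mixing named and numeric references.
sample = "Caf&eacute;&nbsp;&laquo;ol&#233;&raquo; &#xE9;"

named_hits = NAMED_ONLY.findall(sample)
all_hits = WITH_NUMERIC.findall(sample)

print(named_hits)  # named entities only; the numeric references are skipped
print(all_hits)    # named and numeric references
```

The point of the comparison is that `&\w+;` skips numeric references, because `#` is not a word character. If the files contain references such as `&#233;`, a second rule along the lines of the broader pattern may be needed.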


    Regards

    Paul

    Paul Filkin | RWS Group


  • Thank you Paul!!
    I will try this way.
    Best regards,
    Javier
  • Hi Paul,

    Here is the workflow I think will work best for me:

    -I use MS Excel to replace entities such as &aacute;, &eacute;, etc. with the corresponding characters á, é, etc.
    -I import the Excel file into SDL Trados with your rule (&\w+;). This way I can see the characters á, é, etc. in the source text, while other entities that I don't need to translate, such as &nbsp;, &laquo; and &raquo;, stay protected.
    -I press CTRL + Insert so that all the tags are carried over to the target segment.
    -Using autocorrect is a great idea, but I would end up with entities all over my target text, which would make it hard to read. So I decided to write my translation without entities.
    -Once my translation and revision are done, I replace all the accented characters in the target text with their corresponding entities.
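The first step of this workflow (decoding only the letter entities while leaving the others for the placeholder rule) could also be scripted instead of done by hand in Excel. The following is a rough sketch, assuming the cell text is already available as a Python string. The function name `decode_letter_entities`, the "decode only if the result is a single letter" rule, and the sample cell are all assumptions made for illustration.

```python
import html
import re

# Same shape as the placeholder rule: a named entity such as &aacute;
ENTITY = re.compile(r"&\w+;")


def decode_letter_entities(text: str) -> str:
    """Decode entities that stand for letters; leave all other entities as-is."""

    def replace(match: re.Match) -> str:
        decoded = html.unescape(match.group(0))
        # Keep &nbsp;, &laquo;, &raquo;, &amp; etc. untouched: they do not
        # decode to a single alphabetic character.
        if len(decoded) == 1 and decoded.isalpha():
            return decoded
        return match.group(0)

    return ENTITY.sub(replace, text)


# Hypothetical cell content with embedded HTML.
cell = "<p>&laquo;Informaci&oacute;n&nbsp;t&eacute;cnica&raquo;</p>"
decoded_cell = decode_letter_entities(cell)
print(decoded_cell)
```

With this rule the accented letters become readable in the source, while `&nbsp;`, `&laquo;` and `&raquo;` remain in the text for the `&\w+;` rule to protect.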

    If you think I could improve this workflow, do not hesitate to tell me.

    Best wishes

    Javier
  • Hi Paul,

    I deal with this issue quite often, and converting entities to placeholders is really not a solution. First, we need to convert the source entities to their proper characters so that the translator can understand source text containing accented letters. Then, if the target text contains accented letters as well, we need to convert those to their respective entities, which so far we have to do manually.

    For example:

    Translating the word "Spain" from Spanish to Czech:

    With accented characters: España -> Španělsko

    With HTML entities: Espa&ntilde;a -> &Scaron;pan&ecaron;lsko

    The source and target contain different entities, so converting "ñ" into a tag in order to transfer it safely into Czech wouldn't make much sense.

    It does make sense when dealing with non-breaking spaces, symbols and other characters which don't change during translation, though.
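The manual conversion of target text back to entities can be approximated outside the CAT tool. Below is a minimal sketch that inverts Python's built-in HTML5 entity table and applies it to the Spain example. The helper name `encode_non_ascii` and the numeric fallback for characters without a named entity are assumptions; whether the customer's system accepts HTML5-only names such as `&ecaron;` would need to be confirmed.

```python
import html
from html.entities import html5

# Invert the HTML5 table: character -> entity name (with trailing semicolon).
# Only single-character entries are kept; sorting makes the choice repeatable
# when a character has more than one name.
CHAR_TO_NAME = {}
for name in sorted(html5):
    char = html5[name]
    if name.endswith(";") and len(char) == 1:
        CHAR_TO_NAME.setdefault(char, name)


def encode_non_ascii(text: str) -> str:
    """Replace every non-ASCII character with a named or numeric reference."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)  # plain ASCII, including tags, is left alone
        elif ch in CHAR_TO_NAME:
            out.append("&" + CHAR_TO_NAME[ch])
        else:
            out.append(f"&#{ord(ch)};")  # fallback: decimal reference
    return "".join(out)


source_text = html.unescape("Espa&ntilde;a")  # what the translator should see
target_html = encode_non_ascii("Španělsko")   # what goes back into the file
print(source_text)
print(target_html)
```

Because only non-ASCII characters are touched, markup characters such as `<`, `>` and `&` in the embedded HTML pass through unchanged.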

    Could this be improved in a future version of Trados?

    Thanks,

    Vojtech

  • You can handle that very easily: export the Excel content to XML, then use the "XML with embedded content" file type and deal with everything (including entity conversion) in the embedded HTML processor.

    Exporting to XML can be easily done e.g. using this simple method: http://www.excel-easy.com/examples/xml.html

    Of course, a definitive solution is to teach the customer that entities are TOTALLY obsolete in the 21st century and today's Unicode world!