Entity conversion in embedded content

Hi

I have some problems getting entity conversion in embedded content processing in SDL Trados Studio 2015 to work the way I'd like it to work...

I'd like Studio to interpret the character entities it finds (reader settings) and show the resulting characters in the editor, and then write the actual characters, not entities, to the output file (writer settings).

In some cases, Studio even double escapes things.

 

Input?

  • Hehe, Patrik and I discussed this extensively the other day ;-) There are two parts to the problem...
    First, you need to think carefully about what you are actually (un)escaping/converting. For example, if you have HTML embedded inside XML and you already do the entity conversion at the XML level during load, the content reaches the embedded HTML parser already unescaped, and the same applies in reverse during save. So this may be where your double escaping is coming from...
    Second, the conversion itself. If the settings worked as actually intended, you would never be able to achieve what you want, because SDL's general rule is "whatever format comes in must also go out". That is, if the source files contain entities, it is assumed that they are there for a reason (heh, this USED to be true a long time ago, when clients actually KNEW what they were doing...), and therefore they MUST appear in the target files as well. It was never intended that you could select different options for "load" and "save" separately.
    BUT! ;-) There is a long-standing "bug" in Studio across all versions: you can enable the conversion as such (which enables the list box with the individual checkboxes) BUT UNCHECK ALL THE CHECKBOXES... ;-) That combination actually does what you want: it converts the entities to characters during LOAD, but preserves the actual characters during save! ;-)

    Patrik and I discussed how to get the best of both worlds: fixing this "bug" while keeping the option to select different behavior for load and save. We didn't come to a reasonable solution, since it would complicate the settings, and there are different ways to organize a more complex settings page, which would be better looked at by a proper UX designer.

    The point of all this is that these days we almost always get files CRIPPLED in a million different ways, with these entities all over the place NOT for a good reason, but just because the software producing the files was written by some LAME developer who knows sh*t about computers :-\... and all the 'manager' sending the files over knows about computers is "oh, it's that Facebook, Instagram and stuff, isn't it?", so asking for a fix is totally pointless...
    So we engineers need the flexibility.
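
To make the layering point above concrete, here is a minimal Python sketch of how double escaping can arise when entity conversion happens at both the XML level and the embedded HTML level. It is only an illustration under stated assumptions: it uses Python's standard html module, not Studio's actual reader/writer code, and the names load_layer and save_layer are hypothetical.

```python
import html

def load_layer(text):
    """Hypothetical reader step for one layer: entities -> characters."""
    return html.unescape(text)

def save_layer(text):
    """Hypothetical writer step for one layer: characters -> entities."""
    return html.escape(text, quote=False)

# Assumed source: the XML file stores the ampersand escaped once.
stored = "Fish &amp; Chips"

# Load: the XML layer converts the entity, so the embedded layer
# already receives plain characters and has nothing left to convert.
after_xml_load = load_layer(stored)
after_embedded_load = load_layer(after_xml_load)

# Save with conversion enabled at BOTH layers: the embedded writer
# escapes first, then the XML writer escapes the result again.
after_embedded_save = save_layer(after_embedded_load)
double_escaped = save_layer(after_embedded_save)

# Save with conversion at the XML layer only: the file round-trips.
round_tripped = save_layer(after_embedded_load)

print(after_embedded_load)
print(double_escaped)
print(round_tripped)
```

In this sketch the text is unescaped once on load but escaped twice on save, which is the mismatch described above. Escaping at only one layer restores the original file content.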