XML files: adding embedded HTML removes automatic length restriction

Dear Community,

Our client sent us Amazon product texts in an XML file with HTML tags, and we'd like to use them instead of the (tiresome) Excel spreadsheets we've been working with.

We were able to prepare them well enough using a custom XML file type, but we encountered one problem: if we don't use the embedded HTML content processor, tagging and entities are displayed like this:

And if we do enable it, the tags are prepared and excluded perfectly, but the length restrictions (as seen in the orange-coloured DSI in the image above) disappear:

I've tried using (multiple) regular expressions in the XML file type by adding custom DSI to the parser, but to no avail: it was as if I hadn't added anything at all.

I have to admit I'm new to the XML file type and am unsure whether I added this correctly in the parser/DSI menu, but even if I did, that would still leave the entity problem to deal with.

Does anyone know how to convert the HTML tags and entities while keeping the length restrictions from the XML file type?

Thank you in advance!

Guillaume

  •  

    Did you ever find a way forward with this?
    I work best when I have your files, so if you are able to share them, I may be able to explore all the options.

    But for now, this is what I am wondering:

    I know you mentioned Excel files, but we do have a very good multilingual Excel file type, and equally a multilingual XML file type, both of which can be downloaded from the Trados AppStore.
    I mention these alternative file types because both support embedded content processing and context/length checks, and I assume from your post that your findings are based on the default standard file types.

    Second, to explore the out-of-the-box features that help with length restrictions regardless of your parser rules, have you tried implementing this?

    [Image: Trados Studio Project Settings window showing Length Verification options with an error icon next to the 'Check the following contexts only' field.]

    Regards

    Lyds

    Oana Nagy | RWS Group

    _____________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

    [edited by: Trados AI at 10:48 AM (GMT 0) on 29 Feb 2024]
  •   

    Hi Lydia, thank you for taking a look. I didn't think this thread would still be visible anywhere.

    I never really got further than what I described back then:

    - Custom XML file type

    - Length restrictions preserved in the editor through parser rules

    - HTML-tagging processed through embedded content processing based on DSI

    I still haven't been able to solve the entity problem. The file type's entity settings don't seem to touch entities when they're inside HTML tags:

    [Image: Trados Studio project settings window showing Entity conversion options with a list of entity names, characters, and Unicode values.]
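One possible workaround for the entity problem, sketched here as an assumption rather than a Trados feature, is to pre-process the embedded HTML outside Studio before import. The Python sketch below turns character entities such as `&ouml;` into literal characters and leaves the markup-significant ones (`&lt;`, `&gt;`, `&amp;`, quotes) untouched so the HTML stays valid. The function name `decode_text_entities` is hypothetical, and how the fragment is read from and written back to the client's XML depends on a structure not shown in this thread.

```python
import html
import re

# Matches named (&ouml;), decimal (&#246;) and hexadecimal (&#xF6;) references.
ENTITY = re.compile(r"&(#\d+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")

# Characters that must stay escaped so the embedded HTML remains well-formed.
MARKUP_CHARS = {"<", ">", "&", '"', "'"}


def decode_text_entities(fragment: str) -> str:
    """Decode character entities in an HTML fragment, keeping markup escapes."""

    def replace(match: re.Match) -> str:
        decoded = html.unescape(match.group(0))
        # Unknown entities come back unchanged; markup characters stay escaped.
        if decoded in MARKUP_CHARS:
            return match.group(0)
        return decoded

    return ENTITY.sub(replace, fragment)


if __name__ == "__main__":
    print(decode_text_entities("<p>Gr&ouml;&szlig;e &amp; mehr &lt;3</p>"))
```

Running the same step in reverse after translation would only be needed if the client expects the entities back in the delivered file.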

    As for your other suggestions:

    - I love the multilingual Excel file type, but it doesn't work in this case: the client's Excel file contains multiple columns per language, which is customary for Amazon projects
    (e.g. German title - English title - French title - German bullet point 1 - English bullet point 1 - French bullet point 1, etc.).

    - I've also tried playing around with the multilingual XML file after I read some of Paul's articles on XPath, but I'm still struggling to make it work here. I think this XML structure doesn't fit the concept, as I need to exclude some untranslatable paths that are on the same level as the translatable ones.

    - As for the length verification checker, it can only check one specific length restriction, but there are different ones depending on the module. That problem I was able to solve using the parser rules, but they disappear as soon as I activate one of the embedded content processor file types.
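The per-module length limits described in the last point could also be checked outside Studio on the translated XML, as a safety net for the case where the parser rules are lost. The following Python sketch is only an illustration under assumptions: the element names (`title`, `bullet`, `sku`), the `LIMITS` table, and the sample file are hypothetical stand-ins for the client's real structure. It counts visible characters only, so HTML tags are ignored and entities count as one character each.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical limits per element name ("module"); replace with the real ones.
LIMITS = {"title": 20, "bullet": 30}

# Hypothetical sample: HTML is stored as escaped text inside the XML elements.
SAMPLE = """<product>
  <title>&lt;b&gt;Caf&amp;eacute;&lt;/b&gt; mugs</title>
  <bullet>&lt;p&gt;A very long bullet point that exceeds the limit&lt;/p&gt;</bullet>
  <sku>ABC-123</sku>
</product>"""


class _TextOnly(HTMLParser):
    """Collects text content, dropping tags and decoding character entities."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def visible_length(fragment: str) -> int:
    """Return the number of characters a reader would see in the fragment."""
    parser = _TextOnly()
    parser.feed(fragment)
    parser.close()  # flush any buffered text
    return len("".join(parser.parts))


def find_overlong(xml_text: str, limits: dict) -> list:
    """Return (tag, length, limit) for every element exceeding its limit."""
    problems = []
    for elem in ET.fromstring(xml_text).iter():
        limit = limits.get(elem.tag)
        if limit is None:
            continue  # element has no length restriction (e.g. untranslatable)
        length = visible_length(elem.text or "")
        if length > limit:
            problems.append((elem.tag, length, limit))
    return problems


if __name__ == "__main__":
    for tag, length, limit in find_overlong(SAMPLE, LIMITS):
        print(f"{tag}: {length} characters, limit {limit}")
```

Elements without an entry in `LIMITS` are skipped, which also covers untranslatable paths that sit on the same level as the translatable ones.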

    Unfortunately I can't send you the client's file itself, so I understand it's tough to really help me out with this. I was hoping I might come across someone with a similar problem and a solution that works for mine.

    All best,

    Guillaume
