Escaped Embedded Content in XML File

Dear all,

I'm trying to get the contents of a huge XML file into Studio 2017 (latest version). The structure is rather simple:

<?xml version="1.0" encoding="UTF-8"?>
<surveyText surveyId="SV_cTPOOTkcoKOkkvz">
    <QID176_QuestionText>&lt;p&gt;Welcome.&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;This is just an example.&amp;nbsp; There is lots of text.&lt;/p&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;div&gt;But also lots of tags within the elements.&lt;/div&gt;</QID176_QuestionText>
    <QID644_QuestionText>&lt;div&gt;This is a second question.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br&gt;&lt;/div&gt;&lt;div&gt;How can I convert the escaped HTML tags into Studio tags?&lt;/div&gt;</QID644_QuestionText>
    <QID643_Choice1>Choice 1</QID643_Choice1>
    <QID643_Choice2>Choice 2</QID643_Choice2>
</surveyText>

Getting the element content into Studio is not a problem. However, even though I created a new XML file type with Embedded Content Processing and entity conversion enabled, what I see in the Editor is this:

<p>Welcome.&nbsp;</p><p><br></p><p>This is just an example.&nbsp; There is lots of text.</p><p><br></p><div>But also lots of tags within the elements.

Is it somehow possible to let Studio know that "&lt;p&gt;" in the original XML file should be treated as a paragraph tag and not as the literal text "<p>"?

Thanks so much for your help!
Holger

  • Unknown said:
    Does anybody have an idea how to persuade Studio to treat the whole file as ONE thing with individual segments instead of thousands of individual XML files?

    This is currently not possible.
    Each part passed to the embedded content processor is treated internally as a separate part in the SDLXLIFF, hence the separators.

    If some parts of your XML will never contain embedded HTML, you can reduce the amount of "orange clutter" in the editor by changing your XML parsing rules so that only the really relevant content is passed to the HTML embedded content processor. I know you said all of the elements can contain HTML, but this is often not entirely the case: usually only some of the XML elements really can contain HTML, and the remaining majority contain only plain text.
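To find out which elements actually carry embedded HTML before adjusting the parsing rules, a quick scan outside Studio may help. The following is only a minimal sketch in Python, not a Studio feature: the function name `elements_with_embedded_html` and the tag-matching regular expression are assumptions made for illustration, and the regex is a rough heuristic rather than a real HTML parser. The sample reuses two elements from the file shown in the question.

```python
import re
import xml.etree.ElementTree as ET

# Heuristic (assumption): anything shaped like <p>, </div> or <br> counts as an HTML tag.
TAG_RE = re.compile(r"</?[A-Za-z][A-Za-z0-9]*(\s[^<>]*)?/?>")

def elements_with_embedded_html(root):
    """Return the names of the direct children of root whose text contains HTML-like tags."""
    names = []
    for child in root:
        # The XML parser has already decoded &lt; and &gt;, so child.text
        # holds the embedded HTML as ordinary characters.
        if child.text and TAG_RE.search(child.text):
            names.append(child.tag)
    return names

sample = """<surveyText surveyId="SV_cTPOOTkcoKOkkvz">
    <QID644_QuestionText>&lt;div&gt;This is a second question.&amp;nbsp;&lt;/div&gt;</QID644_QuestionText>
    <QID643_Choice1>Choice 1</QID643_Choice1>
</surveyText>"""

root = ET.fromstring(sample)
print(elements_with_embedded_html(root))  # ['QID644_QuestionText']
```

For the real file, `ET.parse(path).getroot()` could replace `ET.fromstring(sample)`. Elements that never show up in the result are candidates for plain-text parsing rules, which would leave only the flagged ones going through the embedded content processor.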
