How to mark escaped HTML content not enclosed by CDATA tags in XML files for processing with the embedded HTML processor?

I'm trying to create a custom file type for .opf files, which are the XML files inside .epub files that hold the metadata entries.
A greatly simplified version of this file type looks like this:

<?xml version="1.0" encoding="utf-8"?>
<package version="2.0" unique-identifier="BookId" xmlns="www.idpf.org/.../opf">
  <metadata xmlns:dc="purl.org/.../" xmlns:opf="www.idpf.org/.../opf">
    <dc:title>The book of the year</dc:title>
    <dc:creator opf:file-as="Doe, John" opf:role="aut">John Doe</dc:creator>
    <dc:language>en</dc:language>
    <dc:identifier id="BookId" opf:scheme="UUID">urn:uuid:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx</dc:identifier>
    <dc:description>&lt;p&gt;The &lt;strong&gt;greatest&lt;/strong&gt; book &lt;em&gt;ever&lt;/em&gt; written.&lt;/p&gt;</dc:description>
  </metadata>
  <manifest>
    <item id="Section0001.xhtml" href="Text/Section0001.xhtml" media-type="application/xhtml+xml"/>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="Section0001.xhtml"/>
  </spine>
</package>

Note that the <dc:description> tag contains escaped HTML tags:

[Screenshot: Trados Studio interface showing the XML content in two columns. The left column highlights 'The book of the year' and the escaped HTML tags in 'dc:description'. The right column is identical but without highlights.]

Selecting the elements that should and shouldn't be translated via XPath is pretty straightforward.

I used:

//dc:title
//dc:description
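
For anyone who wants to check these two expressions outside Studio, here is a minimal Python sketch using the standard library. The namespace URI `urn:example:dc` is a placeholder I made up (the real URI is shortened in the sample above), and `select_translatable` is just an illustrative helper name. ElementTree needs the paths written as `.//dc:title` rather than `//dc:title`. Note that once the XML is parsed, the description text comes back as plain HTML, because the parser has already decoded the `&lt;` and `&gt;` entities.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (assumption): the real one is elided in the post.
NS = {"dc": "urn:example:dc"}

OPF = """<package>
  <metadata xmlns:dc="urn:example:dc">
    <dc:title>The book of the year</dc:title>
    <dc:language>en</dc:language>
    <dc:description>&lt;p&gt;The &lt;strong&gt;greatest&lt;/strong&gt; book &lt;em&gt;ever&lt;/em&gt; written.&lt;/p&gt;</dc:description>
  </metadata>
</package>"""


def select_translatable(xml_text):
    """Return the text of the elements matched by the two XPath expressions."""
    root = ET.fromstring(xml_text)
    result = {}
    for path in (".//dc:title", ".//dc:description"):
        for element in root.findall(path, NS):
            # The XML parser has already turned &lt;p&gt; back into <p>.
            result[path] = element.text
    return result


selected = select_translatable(OPF)
for path, text in selected.items():
    print(path, "->", text)
```
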

However, I can't figure out how to tell Studio to select the <dc:description> element for processing with the embedded HTML processor, because I couldn't find any documentation on how to define a custom structure and mark it for embedded content processing.
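
To make clear what result I'm after, here is a rough Python sketch of the split I'd expect embedded content processing to produce for the description: the HTML tags become inline tags and only the text in between is offered for translation. This is not how Studio does it internally, only an illustration of the expected outcome; `TagTextSplitter` and `split_embedded_html` are names I made up for the example.

```python
from html.parser import HTMLParser


class TagTextSplitter(HTMLParser):
    """Collect (kind, value) tokens: 'tag' for markup, 'text' for translatable text."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the start tag exactly as written.
        self.tokens.append(("tag", self.get_starttag_text()))

    def handle_endtag(self, tag):
        self.tokens.append(("tag", f"</{tag}>"))

    def handle_data(self, data):
        self.tokens.append(("text", data))


def split_embedded_html(content):
    """Split unescaped HTML content into tag tokens and text tokens."""
    parser = TagTextSplitter()
    parser.feed(content)
    parser.close()
    return parser.tokens


# The <dc:description> content after the XML entities have been unescaped.
description = "<p>The <strong>greatest</strong> book <em>ever</em> written.</p>"
tokens = split_embedded_html(description)

for kind, value in tokens:
    print(f"{kind:4} {value!r}")
```
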

I'd greatly appreciate some pointers on how to do this.



[edited by: Trados AI at 6:04 AM (GMT 0) on 29 Feb 2024]