Extract hyperlink content from .docx in Studio 2021

Hi there,

I am processing docx files exported from a Confluence environment, and they contain different kinds of hyperlinks. Some are extracted correctly, so that both the address and the display text are translatable, but others are not, and I could use some help customizing the file type settings to make them work.

This is what I see for hyperlinks that are extracted correctly:

Screenshot of Trados Studio showing correctly extracted hyperlinks with full addresses and display text highlighted in purple.

And this is what I see for the hyperlinks that are not extracted correctly:

Screenshot of Trados Studio showing a hyperlink not extracted correctly, only displaying 'V-Vendor' without a full address.

This hyperlink is different in that it only references "V-Vendor" and doesn't give a full address. When we import the file back into the Confluence wiki, the link works. We just need to be able to change "V-Vendor" so that it starts with the first letter of the translation of 'vendor', which will obviously not be a 'V' in all languages.
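In case it helps anyone reproduce the difference outside Studio, here is a minimal Python sketch for checking how the two kinds of links are stored in the file. It assumes the link targets are kept as hyperlink relationships in word/_rels/document.xml.rels, which is the usual place for them in a docx, but I have not confirmed that the Confluence export does it this way (they could also be field codes). The function name and the file name in the usage note are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace of the relationships part and the relationship type used for hyperlinks.
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
HYPERLINK_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"
)


def classify_hyperlink_targets(rels_xml):
    """Split hyperlink targets into (full_addresses, short_references).

    rels_xml is the content of word/_rels/document.xml.rels (str or bytes).
    A target counts as "full" when it has a URL scheme such as https or mailto;
    anything else (for example "V-Vendor") is treated as a short reference.
    """
    root = ET.fromstring(rels_xml)
    full, short = [], []
    for rel in root.findall(f"{{{REL_NS}}}Relationship"):
        if rel.get("Type") != HYPERLINK_TYPE:
            continue  # skip styles, images, and other non-hyperlink relationships
        target = rel.get("Target", "")
        if urlparse(target).scheme:
            full.append(target)
        else:
            short.append(target)
    return full, short
```

To run it on a real file, the relationships part can be read with the standard zipfile module, for example `zipfile.ZipFile("export.docx").read("word/_rels/document.xml.rels")`, and passed to the function as bytes. If the 'peculiar' links all end up in the second list, that would at least confirm what a rule in the Embedded Content settings needs to single out.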

I have looked at the file type Embedded Content settings for docx, and this is what I have come up with so far. The manual really doesn't help in working out what to enter here, so I was wondering if someone has any idea, based on the above? It would be nice if only those 'peculiar' links were processed and externalized. The normal ones work well, and it would be a shame if all hyperlink tags ended up as normal text instead of tags.

Screenshot of Trados Studio file type settings for docx Embedded Content with fields for Tag Type, Regular Expression, Start Tag, and End Tag.

Thanks so much!

Janneke
