Excluding any text that isn't English from the source section in a bilingual XML

I have a bilingual XML file. Is there a way of automatically excluding any text that isn't English from the source section when processing the file in Trados? The source is currently a mix of Arabic and English, and I only want to translate the English into Arabic. I could go through the file manually and lock all of the Arabic sections, but it is a really large file, so that would take too long. Is there a way of doing this automatically, or of creating a setting that excludes the Arabic in the first place? Thank you!

I have SDL Trados Studio 2021 - 16.0.2.3343


0 Paul Filkin over 3 years ago

Miranda Sambidge

Miranda Sambidge said:
I have a bilingual XML file, is there a way of automatically excluding any text that isn't English from the source section when processing the file in Trados?

Probably.

Can you provide a sample of the XML? That is the only way you'll be able to get a concrete answer to this question.

Paul Filkin | RWS

Design your own training!
You've done the courses and still need to go a little further, or still not clear?
Tell us what you need in our Community Solutions Hub
0 Miranda Sambidge over 3 years ago in reply to Paul Filkin

Thank you Paul! Any chance I could send you an email with the file? Not sure I can share it publicly unfortunately as we have a few NDAs in place with the client. Thank you!
0 Miranda Sambidge over 3 years ago in reply to Paul Filkin

Thanks for this Paul, let me have a look and get back to you!
+1 Paul Filkin over 3 years ago in reply to Miranda Sambidge

Miranda Sambidge

In the meantime I did try to solve this and couldn't! But I did get a bit of help. It was an interesting one for my own learning journey, so I created a small example and asked on Stack Overflow:

https://stackoverflow.com/questions/74386847/only-xpath-for-extracting-text-for-multiple-conditions-in-xml-no-code-possible

Got the answer just now, which I would not have been able to do myself:

//*[text()[normalize-space()]][not(ancestor-or-self::*/@countries) or contains(ancestor-or-self::*[@countries][1]/@countries, 'ME')]

Like this in Studio:

This seems to do the trick. It is possibly the most complicated XPath I've tried to solve, and it is mainly needed because the source file itself isn't well thought out for localization purposes. The simplest and least error-prone approach would have been to add the supported countries to every element containing translatable text; then you could always easily pull out the languages you need.

However, this is pretty clever and you can easily adapt by changing the language code for the languages you need.
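For anyone who wants to see what the expression selects without opening Studio, here is a minimal Python sketch of the same logic. It is not how Studio evaluates the rule: it re-implements the two conditions by hand with the standard library, because `xml.etree.ElementTree` does not support the `ancestor-or-self` axis. The sample XML, its text, and the helper names `has_own_text` and `select_for` are made up for illustration; only the element names and the `countries` and `medium` attributes come from this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, loosely modelled on the elements quoted in this thread.
SAMPLE = """<Topic>
  <Description medium="all" countries="TR,ME,IL,CN">
    <P medium="all">Manual column adjustment.</P>
  </Description>
  <Description medium="all">
    <P medium="all">Electric column adjustment.</P>
    <P medium="all" countries="GB,KR">Optional.</P>
  </Description>
  <Note countries="GB,KR">
    <P countries="ME">Overrides the parent.</P>
    <P>Inherits GB,KR.</P>
  </Note>
</Topic>"""


def has_own_text(elem):
    """Mirror text()[normalize-space()]: a direct, non-whitespace text node."""
    chunks = [elem.text] + [child.tail for child in elem]
    return any(chunk and chunk.strip() for chunk in chunks)


def select_for(root, code):
    """Return text-bearing elements that apply to `code`, in document order."""
    selected = []

    def walk(elem, inherited):
        # The nearest ancestor-or-self carrying @countries wins.
        countries = elem.get("countries", inherited)
        # No @countries anywhere above means the text applies everywhere.
        if has_own_text(elem) and (countries is None or code in countries):
            selected.append(elem)
        for child in elem:
            walk(child, countries)

    walk(root, None)
    return selected


if __name__ == "__main__":
    for elem in select_for(ET.fromstring(SAMPLE), "ME"):
        print(elem.tag, elem.text.strip())
```

Note that `contains()` is a plain substring test, so 'ME' would also match any longer value that happens to contain those two letters. The sketch keeps that behaviour with `code in countries`.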

Paul Filkin | RWS

0 Miranda Sambidge over 3 years ago in reply to Paul Filkin
Hi Paul,

Thank you again for your patience regarding this reply and for your continued help in general! I've sent it onto the client to provide more insight and this is their response. Does this help at all? They've mentioned that they're happy to jump in on a call if that will help clarify anything, let me know if that's something you'd be interested in, no worries if not. Their reply is as follows, your original message is in red:

Thanks for the update. So, there will be a mix of both, depending on the context. Sometimes an entire section (parent) will be country applicable; in other cases the parent is universally used but there will be a section (child) that's only applicable to a certain country.

An example of the latter would be, say, a full size spare wheel on DBX. Everywhere can have it, but in certain countries it is an option (the default being a can of tyre foam), so there may be a P medium (essentially a line of text) which would have text saying Optional in the markets where it doesn't come as standard, and would be country tagged.

An example of a parent being tagged could be something like the tracker, as the whole feature is only available in certain markets. This could then get further filters within it by having individual country tags on P or Description, again for certification text. For example, the same tracker functionality throughout Europe but different certification in Russia to Germany.

Hopefully that makes sense?

To try and answer the two examples given:

First Example [your first screenshot]:

You can see that only the parent elements contain the country. In the first para it's in this:

<Description medium="all" countries="TR,ME,IL,CN">

In the next it's in this:

<P medium="all" countries="TR,ME,IL,CN">

Should I be extracting nothing at all since none of these child elements contain that attribute themselves, or should I simply be extracting them all because they don't have the country attribute as per your second rule above?

Or should I be extracting only the child elements like this, so the last one doesn't get extracted?

So here, in the first case the entire Description is included or excluded depending on the country tag.

In the second paragraph, the whole Description would be included, but the <P medium="all" countries="TR,ME,IL,CN"> bit would be included or excluded.

At a guess from the coding above (and because I’m sad and remember these things) that would be the steering column adjustment. The top Description would be describing manual column adjustment, which only applies in those particular regions. The second Description would be electric column adjustment, available on every variant of car, but as an optional feature for TR/ME/IL/CN, and the text that has the country tags would probably say optional. Hopefully giving an example of how that would be used in the book clears that one up.

Second example [your third screenshot]:

The translatable text is all in child elements of Note, and there is a country rule applied. But then the individual child elements of Note seem to override it. So this suggests I should pay attention to child elements and apply the attribute from the parent... unless they are overridden.

Ok so here, this might be more of a fringe case as I’d usually do a separate note for each entry rather than a single parent note and then separate text for each entry inside it with no single P entry to apply to all of them – for example there was another P medium in there with no country tag so would be in for all NR,KR,GB,TR, but essentially both tiers (Note and P) would be included for respective translation?
0 Paul Filkin over 3 years ago in reply to Miranda Sambidge

Miranda Sambidge

I think this is not needed now. I gave you the solution in my last post. Have you tried it?

//*[text()[normalize-space()]][not(ancestor-or-self::*/@countries) or contains(ancestor-or-self::*[@countries][1]/@countries, 'ME')]

Paul Filkin | RWS

0 Miranda Sambidge over 3 years ago in reply to Paul Filkin

Apologies Paul, I didn't see it! I will try it and get back to you!
0 Miranda Sambidge over 3 years ago in reply to Paul Filkin

Hi Paul,

I've tried this solution (so has another of my colleagues), but we're getting the same word count that we do if there is no filter applied at all. I'm wondering if we're not doing it correctly. Just to check, do we need to create a new file type, or can I just add it to the pre-existing XML file type in Trados? There are other XPaths within that file type of course, but I added the new one and moved it to the top of the list so that Trados prioritises it. I've also created a new file type and added the XPath, but again nothing changed in the output.

The settings that I used are as follows, which seem to match yours, are they correct?

If I create a new file type and try to match your view, this is what I've got, but it's still just importing everything and not filtering out the bits that we don't want.

Thanks Paul!

0 Miranda Sambidge over 3 years ago in reply to Paul Filkin

Hi Paul,

Just to check that you received my below message about the setting not working for us? I want to check that what we're doing is correct. The client is wanting to move from Across to Trados so we need to get the settings file in place before they make the move so they're currently chasing for an update.

Thank you in advance for your continued help!
0 Paul Filkin over 3 years ago in reply to Miranda Sambidge

Miranda Sambidge

No, I didn't see it. If you tag me it helps (type @ and select me from the list that appears) as I'll see the notification.

Miranda Sambidge said:
do we need to create a new file type,

Better to create a new one. Then create your project again. Also make sure that your project is actually using your new file type. Just check the file identifier column in the Files view and make sure it's the one you expected to use.

I can't check as I can't find the example you sent me (you used some other name I don't recall).

Paul Filkin | RWS

0 Paul Filkin over 3 years ago in reply to Paul Filkin

Miranda Sambidge

Miranda Sambidge said:
There are other XPaths within that file type of course, but I added the new one and moved it to the top of the list so that Trados prioritises it.

Why? Why are there other xpaths when you provided exactly what rules were required for this file? I suggest you remove all the other parser rules you have and only use the ones I provided. Then see if the expression works.

But really... what are the other rules for and do you understand how they might influence the rules I gave you?

Paul Filkin | RWS

0 Miranda Sambidge over 3 years ago in reply to Paul Filkin

Hi Paul Filkin

Apologies if my message wasn't very clear, I tried two workflows:

1) I created a new file type which then just contained your XPath:

//*[text()[normalize-space()]][not(ancestor-or-self::*/@countries) or contains(ancestor-or-self::*[@countries][1]/@countries, 'ME')]

And the XPath:

//*

as per your screenshot, however the analysis was the same as if I didn't apply any special settings

2) I also added the XPath to the existing XML file type in Trados, to see if I needed to do that instead (which is why I mentioned that there are other default XPaths within that file type), but again there was no difference to the w/c

I've tried this multiple times now, tweaking the settings each time and ensuring that the project is using the correct settings but I receive the same w/c each time, and it does not omit the text that isn't for translation.

Would you be able to export and send me the settings that you created? As I say, I've tried multiple times now and have made small adjustments each time, but I just can't find a combination of settings that ensures that this XPath works. I feel like I'm missing something but can't see what, so thinking that having your settings file and being able to recreate it exactly would help my understanding.

Is that ok? Thanks Paul!
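One way to check whether the rule or the Studio setup is at fault is to count words outside Studio. Below is a rough Python sketch, assuming the same inheritance logic as the XPath above: it counts the words in every text-bearing element and the words in only those elements that apply to one code. The function names `own_text` and `word_counts` and the sample XML are hypothetical, and a whitespace split will not match Studio's own word count, so treat the numbers as a comparison only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; replace with the contents of a real file.
SAMPLE = (
    '<Note countries="GB,KR">'
    '<P countries="ME">Fitted as standard.</P>'
    "<P>Optional extra.</P>"
    "</Note>"
)


def own_text(elem):
    """Join the non-whitespace text nodes that are direct children of elem."""
    parts = [elem.text or ""] + [child.tail or "" for child in elem]
    return " ".join(part.strip() for part in parts if part.strip())


def word_counts(xml_string, code):
    """Return (all_words, words_kept_for_code) using a whitespace split."""
    root = ET.fromstring(xml_string)
    total = kept = 0
    stack = [(root, None)]
    while stack:
        elem, inherited = stack.pop()
        # The nearest ancestor-or-self carrying @countries wins.
        countries = elem.get("countries", inherited)
        words = len(own_text(elem).split())
        total += words
        if countries is None or code in countries:
            kept += words
        stack.extend((child, countries) for child in elem)
    return total, kept


if __name__ == "__main__":
    total, kept = word_counts(SAMPLE, "ME")
    print(f"all text: {total} words, kept for ME: {kept} words")
```

If the two numbers differ on the real file but the Studio analysis does not change, the expression is probably fine and the project is more likely not using the intended file type.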

0 Paul Filkin over 3 years ago in reply to Miranda Sambidge

Miranda Sambidge

Can you send me the file again please? I don't keep all the files I'm sent... tend to test, see if I can fix and then discard.

Thank you.

Paul Filkin | RWS

0 Miranda Sambidge over 3 years ago in reply to Paul Filkin

Paul Filkin

Will do that now!
0 Paul Filkin over 3 years ago in reply to Miranda Sambidge

Miranda Sambidge

Thank you. If I use anyXML (default) I get an analysis like this using an empty TM:

If I use a custom filetype with the two rules I get this:

I have put my settings file here:

AR_only.zip

Take a look... see if you get the same results as me and if you do then just compare screen by screen what I have that's different in the settings.

Paul Filkin | RWS

0 Miranda Sambidge over 3 years ago in reply to Paul Filkin

Paul Filkin

Hi Paul, thanks for sending that over. I've checked the settings on our side, and have asked the client to check that the file with the settings applied only contained the text that they needed for Arabic, and they've confirmed that it does - so thank you! Using these settings we've been able to copy them for the other languages that we needed to, so they're really helpful.

I've noticed something about the settings, however: the tags within the project disappear when the settings are applied. If I create a project without the settings, the tags are present, but when the settings are applied the tags disappear. Do you know why this might be the case?

Thanks Paul!
0 Paul Filkin over 3 years ago in reply to Miranda Sambidge

Miranda Sambidge

I don't have your files anymore... you take so long between posts I didn't think you would come back! However, the tags disappearing is probably down to different segmentation as a result of the rules, and all tags that are now on the outside edges of a segment will be moved out as they don't need to be internalised.

Or are you telling me that the tags are missing in the translated XML files?

Paul Filkin | RWS

0 Miranda Sambidge over 3 years ago in reply to Paul Filkin

Paul Filkin

As usual, I think you're right and I panicked prematurely! Thanks Paul, you've been such a help!
0 Paul Filkin over 3 years ago in reply to Miranda Sambidge

Miranda Sambidge

Phew!

Paul Filkin | RWS
