Sentence-based TM re-segmentation of (imported cell-based) aligned/bilingual content (Excel/XLIFF/SDLXLIFF)

Hi,

I have done an alignment (German/English) via the Bilingual Excel file type, and complete Excel cells were saved as TM segments in the Studio TM (see attachment), even though they contain more than one sentence. The same happens with bilingual (SDL)XLIFF files. The TM segmentation, however, is set to the full stop rule (sentence-based).

I know I could have copied the columns into two Excel files and performed a traditional alignment, which would have worked too. But I wanted to do it using just one file. The result, in terms of TM segmentation, strikes me as odd :/. As the (customer-related) data is confidential, I cannot post it or any screenshots of it here.

If I import a new Excel sheet for translation using the same TM, the sheet is segmented according to the full stop rule. I can then only use upLIFT matching or concordance search, but I get no fuzzy matches, because the scores fall below the 70% threshold.
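To illustrate why a sentence scores so poorly against a whole-cell TU, here is a minimal sketch. It uses Python's `difflib.SequenceMatcher` as a stand-in similarity measure; this is an assumption for illustration only and is not how Studio computes fuzzy match values. The example strings are invented.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in, not Studio's scoring."""
    return SequenceMatcher(None, a, b).ratio()


# Hypothetical TM content: one cell-based TU holding three sentences.
cell_tu = "Press the start button. Wait until the green light is on. Then open the cover."
# A new document segmented by the full stop rule yields single sentences.
sentence = "Wait until the green light is on."

cell_score = similarity(cell_tu, sentence)
sentence_score = similarity(sentence, sentence)

print(f"against cell-based TU:     {cell_score:.0%}")
print(f"against sentence-based TU: {sentence_score:.0%}")
```

Although the sentence is contained verbatim in the cell, the extra text around it drags the score under a 70% threshold, whereas the same sentence stored as its own TU would be an exact match.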

As far as I know, there is no easy way to re-segment TMs in Studio using the integrated features (e.g. TM maintenance). Am I right? I also read in the following related thread that you cannot change segmentation rules for the Bilingual Excel file type in Studio:

https://community.sdl.com/solutions/language/translationproductivity/f/90/t/8575

Do you have experience with tools such as Olifant, or even a useful script, for re-segmenting a Studio TM or an exported TMX file? Is there an app for this? I could not find one on the AppStore.

Thanks a lot!

Best regards,
Manuel

0 Paul over 7 years ago

Hi Manuel Hörmann ,

I admit you have confused me completely with this question. Why are you aligning a bilingual Excel file? Surely this is already aligned? Also, it is true that segmentation rules cannot be changed for the Bilingual Excel file type. This is because there may not be a correlation between the source and target, so we only segment on the cells themselves to avoid errors.

For resegmenting, maybe you can export to a different format and realign? The SDL Convert application can create a text file for source, and one for target from your SDLTM. See this article for details:

multifarious.filkin.com/.../

Regards

Paul

Paul Filkin | RWS Group

0 Manuel Hörmann over 7 years ago in reply to Paul

Thanks, Paul! I'll definitely check the SDL Convert option for my use case. I did not mean to confuse you or anybody. I had a multilingual file (6 languages in separate columns of one Excel file), which I thought would be a brilliant use case for the Bilingual Excel file type without much manual preparation. I even tried creating a TMX with the Glossary Converter, which produced the same unsatisfactory (cell-based) segmentation.

For a traditional alignment, I need to split the file into individual files. That is not required with the Bilingual Excel file type, where everything is already contained in one file and you just have to point to the respective columns. The downside, however, is the (non-configurable, cell-based) segmentation that you have to fix afterwards. My idea is that even if cell-based segmentation is allowed for a sentence-based TM, you should be able to fix the segmentation afterwards while re-importing (via SDLXLIFF, TMX etc.). Other CAT tools already provide such an option on import ("Import translation unit with sentence segmentation").

Otherwise, there will be no or significantly fewer matches (given the >= 70 percent fuzzy match setting), and customers starting with little legacy data might not fully benefit from upLIFT (especially TU fragments) and TM reuse.
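As a rough illustration of what such a re-segmentation on import could look like for an exported TMX file, here is a minimal Python sketch. It is not an existing Studio feature or app. The function names (`split_sentences`, `resegment_tmx`) and the sample data are made up for this example, the sentence rule is a naive regex that will break on abbreviations such as "z. B.", and TUs with inline tags are left untouched. A TU is only split when every language yields the same number of sentences; otherwise it is kept as it is, since source and target may not correlate sentence by sentence.

```python
import copy
import re
import xml.etree.ElementTree as ET

# Naive full stop rule: ., ! or ? followed by whitespace and an uppercase letter or digit.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+(?=[A-ZÄÖÜ0-9])")


def split_sentences(text):
    """Split plain text into sentences using the naive rule above."""
    return [s.strip() for s in SENTENCE_END.split(text) if s.strip()]


def resegment_tmx(tmx_text):
    """Return TMX text in which multi-sentence TUs are split into one TU per sentence."""
    root = ET.fromstring(tmx_text)
    body = root.find("body")
    new_units = []
    for tu in list(body):
        segs = [tuv.find("seg") for tuv in tu.findall("tuv")]
        # Only handle plain-text segments; inline tags (bpt, ph, ...) are not supported here.
        plain = tu.tag == "tu" and all(
            seg is not None and len(seg) == 0 and seg.text for seg in segs
        )
        parts = [split_sentences(seg.text) for seg in segs] if plain else []
        counts = {len(p) for p in parts}
        if len(counts) != 1 or counts == {1}:
            new_units.append(tu)  # mismatch or nothing to split: keep the TU unchanged
            continue
        for i in range(len(parts[0])):
            clone = copy.deepcopy(tu)  # keeps TU attributes and language codes
            for tuv, sentences in zip(clone.findall("tuv"), parts):
                tuv.find("seg").text = sentences[i]
            new_units.append(clone)
    body[:] = new_units
    return ET.tostring(root, encoding="unicode")


# Invented sample: one cell-based TU with two sentences per language.
SAMPLE = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="de-DE"><seg>Drücken Sie die Taste. Warten Sie kurz.</seg></tuv>
<tuv xml:lang="en-US"><seg>Press the button. Wait a moment.</seg></tuv></tu>
</body></tmx>"""

result = resegment_tmx(SAMPLE)
print(result)
```

The resulting TMX could then be imported into a fresh sentence-based TM. TUs that were skipped because of a sentence count mismatch would still need a manual check or a proper realignment.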

Regards,
Manuel