Converting paragraph to sentence segmentation

Hello!

I have a project I created with paragraph segmentation and a project TM. I found myself breaking the paragraphs down into smaller units, although it's true that I never had to deal with unmerge-able units where I needed them merged.

It's too much work, though. I want to shift to sentence segmentation without losing the work I've already done.

The options:

1.) I could go to the TM and change the preferences to sentence segmentation. Will that affect the already-established segments in any way? Will it affect how and when they're applied in Translation Results?

2.) I could create a new project with a new TM, import all the files again, and have them added to that new project TM.

Or perhaps there's another way that would be best? Any advice and answers to these questions will be appreciated.

NB: There's a good deal of editing involved in this project in addition to translation, so I want to be able to easily align source docs & translations, edit the translations, and both add them to the TM and save the target in (the source's) Word format.

Parents
  • Hi  

    Unknown said:
    1.) I could go to the TM and change the preferences to sentence segmentation. Will that affect the already-established segments in any way? Will it affect how and when they're applied in Translation Results?

    Each TU will still contain the contents of your paragraph, so even though you switch to sentence-based segmentation it will only apply to new work.  This is because the source and target contents may not map 1:1, and Studio has no way of knowing how to handle the re-alignment. So you may have this, for example:

    Four sentences in the source and three in the target.  If you switch to sentence-based segmentation, this TU will still be one TU in the database.  If you opened the same source file used to create it, you would get four segments to translate instead of one, so you won't get any matches from this TU.  A possible exception would be if you're using fragment matching, in which case you may still get some leverage.
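To illustrate the 1:1 problem described above, here is a minimal Python sketch. It is not how Studio segments text: the splitting rule and the function names `split_sentences` and `try_split_unit` are assumptions made up for this example. It only shows that a paragraph-level unit can be split mechanically when both sides have the same number of sentences, and not otherwise.

```python
import re


def split_sentences(text):
    # Naive, assumed rule: break after ., ! or ? when followed by whitespace.
    # Real segmentation rules (abbreviations, numbers, etc.) are more involved.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def try_split_unit(source, target):
    # Pair sentences by position only when the counts match.
    # Returns None when they differ, because nothing in the unit says
    # which target sentence belongs to which source sentence.
    src = split_sentences(source)
    tgt = split_sentences(target)
    if len(src) != len(tgt):
        return None
    return list(zip(src, tgt))


if __name__ == "__main__":
    source = "A one. A two. A three. A four."
    target = "B one. B two and three. B four."
    print(try_split_unit(source, target))
```

With four source sentences and three target sentences the function gives up, which is the case that needs a proper alignment step. Even when the counts match, pairing by position is only a guess.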

    Unknown said:
    2.) I could create a new project with a new TM, import all the files again, and have them added to that new project TM.

    You could, but you'll get little leverage as I mentioned above so would have to translate them again.

    Unknown said:
    Or perhaps there's another way that would be best? Any advice and answers to these questions will be appreciated.

    I guess what you could do is export your TM to TXT and then align them.

    So I used SDLTMConvert from the appstore to create two files from the TM, one for the source and one for the target and then I aligned them:

    http://appstore.sdl.com/app/sdltmconvert/228/

    This is an excellent application that is very useful for things like this.

    Regards

    Paul

    Paul Filkin | RWS Group

