Segmentation rules - in TMs? What about before creating a TM? Segmentation is already being done!

I'm reading the segmentation documentation and I'm missing something: it seems to suggest that segmentation rules are stored in TMs. But one begins translating in Studio before ever creating a TM, and Studio is already breaking the original text into segments. For literary translation I want the segments to be smaller - much smaller. Not sentences, but phrases. In short, what am I missing? Thanks in advance.

  • Hi John,
    When you begin translating in Studio, you will probably have created your very first TM already. Any single file you open follows the default rules of the TM that appears at the top of the list under the general options (File > Options > Language Pairs > All language pairs > TM). Click the TM at the top of the list (or the only one there), then click Settings > Language Resources > Segmentation Rules > Edit.

    Here, you can select paragraph-based segmentation. Perhaps this is what you're looking for? Studio will start a new segment every time it comes to a hard carriage return.

    After changing any default segmentation rules, you will need to process your file again for the new rules to be applied.
    HTH,
    Emma
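
For readers who want to see what the paragraph-based option amounts to, here is a minimal sketch in Python. It is not Studio's implementation or its rule format; the function name `segment_by_paragraph` is hypothetical, and the only behaviour it assumes is the one described above: a new segment starts at every hard return.

```python
import re

def segment_by_paragraph(text: str) -> list[str]:
    """Start a new segment at every hard return (illustrative only).

    Handles Windows (\r\n), old Mac (\r) and Unix (\n) line endings and
    drops segments that are empty after trimming whitespace.
    """
    pieces = re.split(r"\r\n|\r|\n", text)
    return [p.strip() for p in pieces if p.strip()]


sample = "First line. Still the first segment.\nSecond line.\r\n\r\nThird."
segments = segment_by_paragraph(sample)
for number, segment in enumerate(segments, start=1):
    print(number, segment)
```

Note that both sentences on the first line stay in one segment, which is the opposite of the phrase-level splitting asked about in the question.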

  • I see, that makes sense. FYI, I'm looking for segmentation by the smallest possible unit. The units of translation, functionally, are words and phrases. In truth, I find the basic principle of choosing longer segments surprising. What I mean is: the program specifically defaults to NOT breaking at a semicolon - and yet you should have complete grammatical sentences before and after a semicolon. Why would you want to establish a standard only for this COMBINATION of sentences, and not for each sentence in the combination?
  • Unknown said:
    I see, that makes sense. FYI, I'm looking for segmentation by the smallest possible unit. The units of translation, functionally, are words and phrases. In truth, I find the basic principle of choosing longer segments surprising. What I mean is: the program specifically defaults to NOT breaking at a semicolon - and yet you should have complete grammatical sentences before and after a semicolon. Why would you want to establish a standard only for this COMBINATION of sentences, and not for each sentence in the combination?

     
    Paragraph segmentation is ideal if you want to restructure the paragraph, which is why it could be useful in literary translation.
    A semicolon doesn't break the sentence into two segments by default because a translator might well want to build the sentence differently, without a semicolon, and want the whole sentence in one segment to work on it. Of course this depends on the language pair, the genre and the author's style. Hence the option to split and merge segments on an individual basis.
     

    What use would a TM containing such (long) segments be to anyone?
    The TM itself may not be much use, although it will give you a bigger view of a chunk, which may help in the future. The whole idea is to be able to restructure the target-language paragraph as you translate.
     

    Ideally, I'd be able to have two entries - the one that'll be expressed in "Save target as" and a (potentially very different) one that would go into the TM.
    No problem: translate the segment and add it to the TM by clicking "confirm but do not move to next segment". Then translate it again and press Ctrl+Down Arrow to move to the next segment. If you like, you can change the segment status to "translated" by right-clicking it; this doesn't save it to the TM.
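
The difference between the default sentence-level behaviour and the phrase-level segmentation asked for earlier in the thread can be sketched with two regular expressions. This is only an illustration under stated assumptions: the patterns `SENTENCE_BREAK` and `PHRASE_BREAK` and the helper `segment` are hypothetical names, they are not Studio's actual rules, and real rules also need exceptions (abbreviations, numbers, ellipses) that are ignored here.

```python
import re

# Sentence-level: break after ., ! or ? when followed by whitespace.
# A semicolon does NOT end a segment, matching the default described above.
SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")

# Phrase-level: additionally break after ; : and , to get much smaller units.
PHRASE_BREAK = re.compile(r"(?<=[.!?;:,])\s+")

def segment(text: str, rule: re.Pattern) -> list[str]:
    """Split text wherever the rule matches, dropping empty pieces."""
    return [piece for piece in rule.split(text.strip()) if piece]


text = "It was late; the house was dark. Nobody, not even the dog, stirred."

print(segment(text, SENTENCE_BREAK))
# ['It was late; the house was dark.', 'Nobody, not even the dog, stirred.']

print(segment(text, PHRASE_BREAK))
# ['It was late;', 'the house was dark.', 'Nobody,', 'not even the dog,', 'stirred.']
```

The second output shows the trade-off discussed above: the units are small enough to match individually, but a translator who wants to rebuild the sentence without the semicolon would have to merge segments again.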