Zero-width Non-joiner

Hi,

As you may know, Persian has a character called the "zero-width non-joiner" (ZWNJ), which is very useful. Normally you put a space between two separate words, but when you want to separate the two parts of a single word, you need to use ZWNJ. As you can see in the attached picture, the problem is that SDL Trados recognizes it as a tag, which makes the project messy. It would be great if SDL Trados could treat it like any other character.

BTW, SDL calls this character NoWidthOptionalBreak.

  • Hi Alireza,
    In the meantime, as long as this character is not shown "like any other character" as you suggest, I would delete it from the source files before translation. I guess this character is only useful in a certain "layout" context and is certainly not relevant for translators or target languages.
    This would also solve the term recognition problem you describe in another post.

    Kind regards
    Sébastien
  • Hi Sébastien,
    Thank you for your reply. Actually, this character is very important in Persian, so what you suggest would be frustrating and practically infeasible: Persian texts contain lots of ZWNJs, and replacing them with full spaces could cause trouble.
  • Sébastien's suggestion is to REMOVE them, not replace them with spaces.
    The point is, if it's ZERO WIDTH, you should not be able to actually SEE them, right? So it should be okay to remove them as it should not make any visual difference...
    Sorry if this sounds stupid, I'm not really familiar with these particularities of Persian.
  • Hi Evzen,
    In fact, removing it is much worse than replacing it, because the two parts of the word that the ZWNJ kept apart will be joined, and putting the ZWNJs back afterwards will be a nightmare. FYI, look at the example below:

    می خواهم (with full space) - می‌خواهم (with ZWNJ) - میخواهم (without any)
  • Hi Alireza,
    Indeed, that was not a good idea. Your examples make it clearer. I ran some tests and saw that converting a txt file containing this special character works fine: no tag is displayed and all 3 words are displayed correctly. So it seems to be in MS-Word, that the conversion into a tag first takes place and Studio is just displaying this tag instead of converting it back to its character form. An enhancement to the Word filter would be necessary.
    I guess translating the text as a txt file or displaying the tag "without text" are not good ideas either...

    Kind regards
    Sébastien
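
To make the difference between the three spellings discussed above concrete, here is a minimal Python sketch. It is not part of the original thread, and the helper names are made up for illustration. It builds the three variants from explicit code points, assuming the ZWNJ is U+200C and the yeh is the Persian U+06CC, and shows what removing or replacing the ZWNJ does:

```python
import unicodedata

ZWNJ = "\u200c"  # U+200C ZERO WIDTH NON-JOINER

# Assumed code points: meem + Farsi yeh, then khah + waw + alef + heh + meem.
PREFIX = "\u0645\u06cc"
STEM = "\u062e\u0648\u0627\u0647\u0645"

with_space = PREFIX + " " + STEM   # two separate words
with_zwnj = PREFIX + ZWNJ + STEM   # one word, parts kept visually apart
joined = PREFIX + STEM             # parts joined, no separator at all


def code_points(text):
    """Return the code points of text as 'U+XXXX' strings (hypothetical helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


def remove_zwnj(text):
    """Delete every ZWNJ, as suggested earlier in the thread."""
    return text.replace(ZWNJ, "")


def zwnj_to_space(text):
    """Replace every ZWNJ with an ordinary space."""
    return text.replace(ZWNJ, " ")


if __name__ == "__main__":
    print(unicodedata.name(ZWNJ), unicodedata.category(ZWNJ))
    for label, word in [("space", with_space), ("ZWNJ", with_zwnj), ("none", joined)]:
        print(label, code_points(word))
    # Both comparisons are True: each operation collapses onto another spelling.
    print(remove_zwnj(with_zwnj) == joined)
    print(zwnj_to_space(with_zwnj) == with_space)
```

Once the ZWNJ is removed, the string is identical to the joined spelling, so nothing in the text records where the character used to be. That is why putting it back afterwards is so hard.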
  • Unknown said:
    So it seems to be in MS-Word, that the conversion into a tag first takes place and Studio is just displaying this tag instead of converting it back to its character form.

    Actually it seems to be rather in the filter than in Word...
    I copied and pasted the example to Word file, saved it and then examined the internal XML - there is no special tag in there, it's just the character itself:
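
A check like the one described above can also be scripted. The following is only a rough sketch, not what was actually run in this thread: it relies on a .docx file being a ZIP archive whose main text lives in word/document.xml, assumes that part is UTF-8 encoded, and uses a made-up function name. It counts ZWNJ characters stored literally or as numeric character references:

```python
import re
import zipfile

ZWNJ = "\u200c"  # U+200C ZERO WIDTH NON-JOINER

# Numeric character references that would also decode to U+200C.
ZWNJ_REFERENCE = re.compile(r"&#x0*200c;|&#0*8204;", re.IGNORECASE)


def count_zwnj_in_docx(docx):
    """Count ZWNJ occurrences in the main document part of a .docx.

    `docx` may be a file path or a binary file-like object.
    Hypothetical helper; assumes word/document.xml is UTF-8.
    """
    with zipfile.ZipFile(docx) as archive:
        xml = archive.read("word/document.xml").decode("utf-8")
    return xml.count(ZWNJ) + len(ZWNJ_REFERENCE.findall(xml))


if __name__ == "__main__":
    import sys

    # Usage: python count_zwnj.py example.docx
    print(count_zwnj_in_docx(sys.argv[1]))
```

A non-zero count means the character is still present as plain text in the Word file, which would point to the filter rather than Word as the place where the tag is introduced.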