How to deal with broken text (fonts)?


I have a problem with documents that contain different fonts.

I mostly translate the Palī language.
It is an ancient language.
It is used only in the Buddhist tradition.

This language is written in the Roman alphabet plus some modified Roman letters (āīūĀŪĪṇṁṃñ).

Long vowels are written like this: āīūĀŪĪ.

Palī has a lot of related fonts.
This is where the problem starts.

In my usual Windows environment, when I use certain fonts, Palī text displays without any problem.

e.g.
Yo pana bhikkhu bhikkhūnaṃ sikkhāsājīvasamāpanno sikkhaṃ appaccakkhāya dubbalyaṃ anāvikatvā methunaṃ dhammaṃ paṭiseveyya, antamaso tiracchānagatāya pi; pārājiko hoti, asaṃvāso

You can see Roman text with some unusual characters (ṃīā).
There is no problem here.

But see below.

Screenshot showing correctly displayed Pali text 'Bhikkhupatimokkha' and 'Saoghadisesa' with numbers 1 and 5 in Trados Studio.

Bhikkhupátimokkha
Saòghádisesa 1
Saòghádisesa 8

These are not displayed correctly.
Some characters are broken.
Specifically,
á was originally ā.
ò was originally ṅ.
á and ò are the broken characters.

I don't know exactly how Windows fonts work.
Anyway, my 'default font'(?) cannot display some Palī text correctly.

So I have to preprocess the document, e.g.

1. Open the PDF in MS Word.
2. Replace the broken characters one by one (Ctrl+H, find ò, replace with ṅ).
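
The one-by-one replacement in step 2 could also be done in a single pass with a small script. The following is only a sketch in Python (it is not a Trados feature): the mapping contains just the two pairs mentioned above, and the names `LEGACY_TO_UNICODE` and `fix_legacy_pali` are made up for illustration. A real legacy font would need its complete table.

```python
# Hypothetical mapping built only from the two pairs named in this post.
# A real legacy Pali font would need its full character table added here.
LEGACY_TO_UNICODE = {
    "\u00e1": "\u0101",  # á (U+00E1) -> ā (U+0101)
    "\u00f2": "\u1e45",  # ò (U+00F2) -> ṅ (U+1E45)
}

# Translation table for str.translate (single characters only).
_TABLE = str.maketrans(LEGACY_TO_UNICODE)


def fix_legacy_pali(text):
    """Replace legacy-font characters with their Unicode equivalents."""
    return text.translate(_TABLE)


print(fix_legacy_pali("Sa\u00f2gh\u00e1disesa 1"))
```

With the two pairs above, the sample line "Saòghádisesa 1" comes out as "Saṅghādisesa 1". The mapping should only be applied to text known to come from the legacy font, because á and ò are valid letters in other languages and would be changed as well.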

Alternatively, I can change the font in the Trados options.
But if I change the font, I may run into other problems.

→ I cannot use the TM and TB that were built with the 'default font' (the one that reads correctly in the default Windows environment*). *As I said, I don't know the exact mechanism behind fonts. This is just my way of describing it.

So I have to stay with the default font.
If I do not unify the TMs and TBs, I have to maintain two different TMs or TBs (one suited to each font).

What would you do if you were me?

Maybe this is not a common problem.
But I think some translators may run into the same problem.

Is it really best to replace the broken characters one by one?

Summary
1. My source text is broken in some fonts.
2. This is because it uses many different fonts.
3. Is there any way to display the text correctly in the default font(?), i.e. how should I deal with broken text (fonts)?

I think this is a general issue rather than a Trados one. But any ideas you could give me would be greatly appreciated.

By the way, I selected English (United Kingdom) as the source language for Palī. I think that causes no problem.



[edited by: Trados AI at 6:13 AM (GMT 0) on 29 Feb 2024]
  • What an interesting question!  I think you're right in that it isn't a common problem.  I believe this is covered by ISO 639-3, but there isn't an appropriate language code in the Microsoft LCID tables that we use, probably because the language is classed as extinct.

    By the way, I selected English (United Kingdom) as the source language for Palī.

    I was going to ask you which language you chose, and this could have some impact.  It probably makes sense to use a language that uses similar character sets and this is unlikely to be English.  But maybe we can start there? 

    Next would be the font, what font are you using for this language?  I found this here:

    ....not all Unicode fonts contain the necessary characters. To properly display all the diacritic marks used for romanized Pali (or for that matter, Sanskrit), a Unicode font must contain the following character ranges:

    • Basic Latin: U+0000 – U+007F
    • Latin-1 Supplement: U+0080 – U+00FF
    • Latin Extended-A: U+0100 – U+017F
    • Latin Extended-B: U+0180 – U+024F
    • Latin Extended Additional: U+1E00 – U+1EFF

    In the section on "Transliteration on computers" it goes on to list quite a few recommended fonts for this language.
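
As a rough illustration of the ranges quoted above, the sketch below groups the non-ASCII characters of a text by the listed range they fall into. It is an assumption-laden helper, not part of Trados: the names `REQUIRED_RANGES`, `block_of` and `blocks_used` are invented here, and the script only inspects the text. Whether a given font actually contains those characters would need a font tool, which is not shown.

```python
# The five ranges quoted above, as (label, first code point, last code point).
REQUIRED_RANGES = [
    ("Basic Latin", 0x0000, 0x007F),
    ("Latin-1 Supplement", 0x0080, 0x00FF),
    ("Latin Extended-A", 0x0100, 0x017F),
    ("Latin Extended-B", 0x0180, 0x024F),
    ("Latin Extended Additional", 0x1E00, 0x1EFF),
]


def block_of(ch):
    """Return the label of the listed range containing ch, if any."""
    cp = ord(ch)
    for name, lo, hi in REQUIRED_RANGES:
        if lo <= cp <= hi:
            return name
    return "outside listed ranges"


def blocks_used(text):
    """Group the non-ASCII characters of text by range label."""
    found = {}
    for ch in text:
        if ord(ch) > 0x7F:
            found.setdefault(block_of(ch), set()).add(ch)
    # Sort each group so the result is stable between runs.
    return {name: sorted(chars) for name, chars in found.items()}


# The modified letters listed in the question: ā ī ū ṇ ṁ ṃ ñ
print(blocks_used("\u0101\u012b\u016b\u1e47\u1e41\u1e43\u00f1"))
```

Run on the letters from the question, this shows that romanized Pali needs characters from Latin-1 Supplement, Latin Extended-A and Latin Extended Additional, so a font lacking the last range would be expected to fail on ṇ, ṁ and ṃ.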

    So I think if we can sort these two questions first:

    1. which language is a more appropriate substitute?
    2. which font should we be using?

    Then we can test again and see if it improves the situation.

    Paul Filkin | RWS Group

  • which language is a more appropriate substitute?

    Thank you for your reply, Paul. Sorry for the late response.
    I thought I wouldn't get a reply.

    First, I think any language written in the Roman alphabet would work. I used Esperanto before, and now I use English (United Kingdom) as the source language for Palī. I've never had a problem (term recognition, TM, TB, etc.).

    which font should we be using?

    Second, when I translate Palī, I have not chosen a specific font, i.e. I use the default font. I don't know how Trados selects the 'default font', but my Windows 10 default font is '맑은 고딕 (Malgun Gothic)'.

    There are a lot of Palī fonts, and as far as I know there is no single most common one.

    Screenshot showing a text excerpt with the phrase 'bhikkhupatimokkhapali nitthita' and a Windows message box displaying a font error for character ' '.

    Here, the text is bhikkhupātimokkhapāḷi niṭṭhitā.

    The broken character is ḷ. If I copy and paste it into Notepad, it appears as ¿.

    Trados also shows that character as ¿.
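
One way to find out what the pasted character really is would be to look at its code point rather than its appearance. The sketch below is only a diagnostic idea in Python, with an invented helper name `describe`; it lists each non-ASCII character with its code point and Unicode name, so a genuine ḷ (U+1E37) can be told apart from a substitute such as ¿ (U+00BF).

```python
import unicodedata


def describe(text):
    """Return (char, code point, Unicode name) for each non-ASCII character."""
    rows = []
    for ch in text:
        if ord(ch) > 0x7F:
            name = unicodedata.name(ch, "<unnamed>")
            rows.append((ch, "U+%04X" % ord(ch), name))
    return rows


# Paste the suspicious text between the quotes; escapes are used here
# so that the example does not depend on the editor's own font.
for row in describe("p\u0101\u1e37i"):
    print(row)
```

If the character copied from the PDF already reports as U+00BF or some other unexpected code point, the text was damaged before it reached the font, and changing the display font alone would probably not repair it.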

    Screenshot of a PDF document with text in English, including the title 'PDFI' and no visible errors or warnings.PDF

    I attached the file. 

    [edited by: Trados AI at 6:13 AM (GMT 0) on 29 Feb 2024]