Latest posts for tag twabo

I'm currently in Cilamitay, in the east of Taiwan. There is a little meeting of Taiwanese Free Software people and people from the Amis, Taroko and Puyuma tribes, with the idea of starting localisation efforts for some aboriginal languages.

These are some of the issues we are going to discuss:

Language code

A new ISO standard (639-3) will hopefully be formalised in January, and it will include language codes for the Taiwanese aboriginal languages. Until then we'll have to work out a temporary solution, but there's good hope that we won't need it for long.

List of characters

Because of Christian missionary influence, both Amis and Taroko use a Roman alphabet with accents. We need to work out the complete list of character and accent combinations, check that everything is in Unicode, and see how they sort.

We then need to find a comfortable way to input them using the keyboards normally available here (English US layout): compose key? Dead keys? How about on Windows?

Womble2 on IRC tells me that on Windows one can work with MSKLC.

Technical terms and country list

We need to work out how to map terms that do not exist in the language.

Technical terms are usually borrowed from Japanese.

Names for all the countries in the world probably do not exist.

Translation interface

We need to find an easy to use interface to input the translations.

There is Rosetta.

There is Pootle. (Thanks to Christian Perrier for pointing me at it)

There is Webpot.

Update: there is now a wiki page on the Debian wiki.

Arne Götje (高盛華) created:

The scripts, especially Amis, make heavy use of Unicode combining characters. They should display well in many applications, at least with the DejaVu Sans font.

Try it out: if it displays correctly, you should see:

  • accented letters instead of letters next to accents.
  • i with both the dot and the accent.
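The second point can also be checked outside a browser by looking at the code points involved. This is only a sketch: it assumes the dotted, accented i is encoded as i, then COMBINING DOT ABOVE (U+0307), then the accent. That encoding is my assumption for illustration, and the helper name `code_points` is made up for this example.

```python
import unicodedata

DOT_ABOVE = "\u0307"  # COMBINING DOT ABOVE
ACUTE = "\u0301"      # COMBINING ACUTE ACCENT


def code_points(text):
    """List the code points of text after NFC normalisation."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]


# i + accent only: NFC folds this into the usual precomposed letter,
# which fonts draw without the dot.
plain_accent = "i" + ACUTE

# i + dot above + accent (assumed encoding): the explicit dot stays
# in the text as its own code point.
dot_and_accent = "i" + DOT_ABOVE + ACUTE

print(code_points(plain_accent))
print(code_points(dot_and_accent))
```

If the second sequence shows up as an i with a dot and an accent stacked on it, the font and the rendering engine are doing the right thing.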

Update: there is now a wiki page on the Debian wiki.

We mapped the available glyphs and accents for the Paiwan language.

The letters in alphabetical order:

a b c d e f h i j k l m n p q r s t u v w y z ḏ nġ ḻ ṟ ṯ

No uppercase.

Update: this character list has been improved and the good version is found in the Debian wiki.

All the characters are in Unicode except nġ, which already needs to be requested for the Amis script.
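A quick way to see what "in Unicode" means here is to ask Python's `unicodedata` module for the character names. The sketch below assumes the underlined letters are the precomposed "WITH LINE BELOW" characters, and it spells nġ as n followed by g with a dot above, which is two code points rather than a single letter. The helper name `describe` is made up for this example.

```python
import unicodedata

# Assumed encoding of the underlined letters: ḏ ḻ ṟ ṯ
UNDERLINED = ["\u1e0f", "\u1e3b", "\u1e5f", "\u1e6f"]


def describe(text):
    """Return 'U+XXXX NAME' for every code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


for letter in UNDERLINED:
    print(letter, describe(letter))

# n + g + COMBINING DOT ABOVE: NFC composes the g and the dot,
# but the result is still two code points, not one letter.
ng_dot = unicodedata.normalize("NFC", "n" + "g\u0307")
print(ng_dot, describe(ng_dot))
```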

We need to design an input method to enter the underlined letters and the nġ.

Update: there is now a wiki page on the Debian wiki.

A year ago we got in touch with various Taiwanese aboriginal tribes to try to start localisation efforts.

Thanks to the research the Taroko people did during 2007 and the prototype work of tonight, the Taroko people in Taiwan can see the computer calendar of the new year in their own language:

We mapped the available glyphs and accents for the Amis language.

The letters in alphabetical order:

    a c d f ng h i k l m n o p r s t u w y

Every one of them can get an acute or circumflex accent on top. ng can also get a dot on top of the g.

The accents are literally on top: i would get the dot PLUS the accent on top.

Not all of the accented characters exist in Unicode as single code points; however, Unicode provides combining characters to take care of these cases.
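To find out which letter and accent pairs have a single code point of their own, one option is to let NFC normalisation try to compose them. This is a minimal sketch using Python's standard `unicodedata` module; the function name `has_precomposed` is made up for the example, and it only covers the acute and circumflex accents.

```python
import unicodedata

ACUTE = "\u0301"       # COMBINING ACUTE ACCENT
CIRCUMFLEX = "\u0302"  # COMBINING CIRCUMFLEX ACCENT


def has_precomposed(base, mark):
    """True if base + mark composes to a single code point under NFC."""
    return len(unicodedata.normalize("NFC", base + mark)) == 1


# Report, for a few of the letters, which accented forms need a
# combining sequence because no precomposed character exists.
for letter in "acdft":
    for name, mark in (("acute", ACUTE), ("circumflex", CIRCUMFLEX)):
        status = "precomposed" if has_precomposed(letter, mark) else "combining only"
        print(f"{letter + mark}  {letter} + {name}: {status}")
```

The pairs reported as "combining only" are the ones that depend on the font and the rendering engine to stack the accent properly.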

Then we need an input method that inserts ng instead of g and allows typing all the accent combinations.

Here is the full character set:

    a     á    â
    c     ć    ĉ
    d     d́    d̂
    f     f́    f̂
    ng    nǵ   nĝ  nġ
    h     h́    ĥ
    i     i̇́    i̇̂
    k     ḱ    k̂
    l     ĺ    l̂
    m     ḿ    m̂
    n     ń    n̂
    o     ó    ô
    p     ṕ    p̂
    r     ŕ    r̂
    s     ś    ŝ
    t     t́    t̂
    u     ú    û
    w     ẃ    ŵ
    y     ý    ŷ

Update: this character list has been improved and the good version is found in the Debian wiki.

The list is not displayed correctly with many fonts or rendering engines. Arne made a test page that explicitly sets a font that works.

The accents are not taken into account when sorting.
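As a rough illustration of what that means for software, here is a sketch of a sort key that throws the accents away and follows the letter order listed above, with ng treated as one letter between f and h. The names `AMIS_ORDER`, `strip_marks` and `sort_key` are made up for this example, the sample strings are arbitrary letter sequences rather than real Amis words, and treating every "n" followed by "g" as the letter ng is an assumption.

```python
import unicodedata

# Letter order as listed in this post; "ng" counts as a single letter.
AMIS_ORDER = ["a", "c", "d", "f", "ng", "h", "i", "k", "l", "m",
              "n", "o", "p", "r", "s", "t", "u", "w", "y"]
RANK = {letter: index for index, letter in enumerate(AMIS_ORDER)}


def strip_marks(text):
    """Decompose text and drop every combining mark (accents, dots)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def sort_key(word):
    """Turn a word into a list of letter ranks, ignoring accents."""
    plain = strip_marks(word)
    key = []
    i = 0
    while i < len(plain):
        if plain.startswith("ng", i):  # assumption: n + g is always the letter ng
            key.append(RANK["ng"])
            i += 2
        else:
            # Anything outside the alphabet sorts after all known letters.
            key.append(RANK.get(plain[i], len(AMIS_ORDER)))
            i += 1
    return key


samples = ["ha", "nga", "fa", "\u00e1"]  # arbitrary strings, not real words
print(sorted(samples, key=sort_key))
```

A real locale definition would express the same rules in its collation section; the Python version is only meant to make the rule concrete.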

Uppercase letters are not used.

Note: the page has been updated to reflect further input from Unicode and Amis people.

Update: there is now a wiki page on the Debian wiki.