Typing Swahili in Arabic script

The main aim of Andika! is to enable Swahili manuscripts in Arabic script to be transcribed digitally in Arabic script - see the section Typesetting poetry. (A second aim, enabling Arabic script to be used easily for current-day writing of Swahili, is covered in the section Spelling conventions. Round-trip conversion between Arabic and Roman scripts is a useful by-product of these two aims, and is covered in the Arabic to Roman and Roman to Arabic sections.)

The keyboard layout proposed here should enable most historical manuscripts to be transliterated letter-for-letter, and since it uses a standard keyboard, and links the Arabic letters to their Roman equivalents, it is easy to start using immediately.

Initial requirements

It is assumed that you have a computer running Ubuntu, and are broadly familiar with how to use it. Ubuntu is a GNU/Linux distribution launched by a South African, Mark Shuttleworth, and the name is cognate with Swahili أُوتُ (utu, humanity).

If you are not currently using Ubuntu, an easy way to get started is to run it from inside Microsoft Windows, using Wubi (Windows-based Ubuntu Installer). Detailed information is available in the Wubi Guide.

It is assumed that you are using the KDE desktop environment, and the instructions here apply to that. Similar options, however, are available if you are using other desktops such as GNOME or Unity. If you wish to use KDE as your own desktop environment, just install the kubuntu-desktop package in Ubuntu.

It is assumed that you are familiar with the basic conventions of the Arabic alphabet (eg the different letter-shapes).

Install a font

In order to see the Arabic script properly, your computer must be able to access a font that includes Arabic letterforms (glyphs). Many fonts nowadays will include some Arabic glyphs, but their attractiveness and coverage vary widely - for instance, many fonts do not include پ (p) or ڠ (g). If you are seeing squares or boxes in the Arabic script, the reason is that the font you are using is missing glyphs.
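
When a character shows up as a box, it can help to know exactly which Unicode code point the font is missing. The following is a minimal sketch in Python (Python is not part of Andika!; it is used here purely for illustration) that lists the code point and official Unicode name of each character in a string. The helper name describe is hypothetical, and the two sample code points are assumed to be the p and g letters cited above.

```python
import unicodedata


def describe(text):
    """Return (character, code point, Unicode name) for each character in text."""
    rows = []
    for ch in text:
        code_point = "U+{:04X}".format(ord(ch))
        # Fall back to a placeholder for characters without an official name.
        name = unicodedata.name(ch, "<unnamed>")
        rows.append((ch, code_point, name))
    return rows


if __name__ == "__main__":
    # Assumed code points for the p and g letters mentioned in the text.
    for ch, code_point, name in describe("\u067e\u06a0"):
        print(ch, code_point, name)
```

Pasting a problem word into describe() gives a list of code points that can then be compared against the character coverage of a candidate font.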

The best font option currently is Scheherazade, created by Bob Hallissy and Jonathan Kew. To get the most recent version, download ScheherazadeRegOT-1.005.zip from this page, and unzip it. Then double-click the file ScheherazadeRegOT.ttf to install the font on your computer. In Ubuntu, it will be installed into /usr/local/share/fonts/.

The Swahili keyboard layout is already available in Ubuntu, but it needs to be updated (because the existing one is based on initial work I did some time ago), and then activated.

Update the keyboard layout

Download this revised Swahili keyboard layout.

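Navigate to where you saved it (the command below assumes the downloaded file is named tz), and then open a terminal there and type the following, which replaces the existing layout file: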
sudo cp tz /usr/share/X11/xkb/symbols/
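
Since the copy is done as root and overwrites a system file, it may be worth checking first that the download really is a layout file rather than, say, a saved web page. The sketch below is a rough check only, assuming the usual xkb symbols format in which each layout section begins with xkb_symbols "name". The function names are hypothetical, and the sample text is invented for the demonstration; the real tz file will contain different sections.

```python
import re

# Matches the header of each layout section, e.g.:  xkb_symbols "basic" {
SECTION_RE = re.compile(r'xkb_symbols\s+"([^"]+)"')


def list_sections(symbols_text):
    """Return the names of the xkb_symbols sections found in the text."""
    return SECTION_RE.findall(symbols_text)


def sections_in_file(path):
    """Read a symbols file and return the section names it declares."""
    with open(path, encoding="utf-8") as handle:
        return list_sections(handle.read())


if __name__ == "__main__":
    # Invented two-section sample, for illustration only.
    sample = 'xkb_symbols "basic" {\n};\nxkb_symbols "example" {\n};\n'
    print(list_sections(sample))
```

Calling sections_in_file("tz") in the download folder should return a non-empty list; an empty list suggests the file is not a keyboard layout.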

Activate the keyboard layout

Click on K → Settings → System Settings:

In the settings dialogues, click on Input Devices → Keyboard:

On the Layouts tab, tick Configure layouts, and then click Add:

Fill in the pop-up dialogue as shown, and then click OK. In future it should also be possible to access the layout from Swahili (Kenya).

Once the new layout is showing in the dialogue, click Apply to exit:

You should now see an additional marker in the system tray at the bottom right of your screen, showing that the UK English keyboard is the one currently in operation. Click on this marker, and it will change to show that the Swahili keyboard is now operational. You can also switch between the two keyboards by pressing Ctrl+Alt+k.

You can now use the Swahili layout to type Swahili in Arabic script.

Layout conventions

The complete Swahili layout is shown below (with thanks to Wikimedia for the layout image):

(Note that this layout can be easily changed if it does not suit your needs - a section on how to do this will be added later.)

To access the contents of each key, the Shift and AltGr keys are used in combination where appropriate, as shown below:

The basic idea behind the keyboard layout is that the relevant Arabic letter will usually be produced by pressing the same key that produces the Roman letter. Additional points:

  • Digraphs such as dh gh th are placed on the same key as d g t, and accessed using the Shift key.
  • The digraph ch is accessed using the c key.
  • The digraph sh is accessed using the Shift+s keys.
  • The occasionally-used digraph kh is accessed using the x key.
  • Arabic letter variants are placed on the same key where possible, eg ي and ى.
  • Similar sets of Arabic letters are placed consistently - for instance, the pharyngeal consonants ص ض ط ظ are all accessed using the AltGr key, as are ؤ ئ, and the alveolar consonants ٹ ڈ used in Mombasa Swahili are accessed using the Shift+AltGr keys.
  • Long and short vowels are located on the same key, with the long vowel accessed by Shift, and the vowel-carrier accessed by AltGr, so for instance the u key produces ُ and Shift+u produces و, with AltGr+u producing ؤ.
  • The letters و ي are also available on the w and y keys respectively, for use when they represent semi-vowels.
  • Non-alphabetic characters from the UK keyboard are available via AltGr and AltGr+Shift.
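
The conventions above amount to each key carrying up to four levels: plain, Shift, AltGr and Shift+AltGr. The following Python sketch models a small, partly assumed excerpt of the layout as a lookup table and turns a sequence of key presses into Arabic-script text. Only the u row is taken directly from the text; the i and m rows and the placement of the d and t rows are inferences from the stated principles, and the names LAYOUT, press and type_keys are hypothetical. It is not the layout file itself.

```python
# Levels of each key, in order: plain, Shift, AltGr, Shift+AltGr.
LEVELS = ("", "Shift", "AltGr", "Shift+AltGr")

# None marks a level that this sketch does not model.
LAYOUT = {
    # Stated in the text: short u (damma), long u (waw), vowel-carrier.
    "u": ("\u064f", "\u0648", "\u0624", None),
    # Assumed by analogy with u: short i (kasra), long i (ya), vowel-carrier.
    "i": ("\u0650", "\u064a", "\u0626", None),
    # Assumed from the "same key as the Roman letter" principle.
    "m": ("\u0645", None, None, None),
    # Inferred from the bullets: d, dh, the AltGr consonant, the Mombasa letter.
    "d": ("\u062f", "\u0630", "\u0636", "\u0688"),
    # Inferred likewise for t and th.
    "t": ("\u062a", "\u062b", "\u0637", "\u0679"),
}


def press(combo):
    """Return the character for a combo such as 'u', 'Shift+u' or 'Shift+AltGr+d'."""
    modifiers, _, key = combo.rpartition("+")
    level = LEVELS.index(modifiers)
    char = LAYOUT[key][level]
    if char is None:
        raise KeyError("level not modelled in this sketch: " + combo)
    return char


def type_keys(combos):
    """Join the characters produced by a sequence of key presses."""
    return "".join(press(combo) for combo in combos)


if __name__ == "__main__":
    # The word mimi (I, me): m, i, Shift+i, m, i.
    print(type_keys(["m", "i", "Shift+i", "m", "i"]))
```

Keeping the four levels in a fixed order mirrors the way the Shift and AltGr keys are combined, so adding a key to the sketch only means adding one row.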

Configure LibreOffice

In order to test the keyboard, we will configure LibreOffice to display text in Arabic script. Launch LibreOffice, and click on Tools → Options → Language Settings → Languages. Set CTL to Arabic (Oman) and tick Enabled for complex text layout (CTL). This will create two new buttons on the icon bar, one for left-to-right typing and one for right-to-left typing. Click the RTL button to move the cursor over to the right-hand side of the line.

There appears to be a bug in LibreOffice whereby the first letters you type in RTL mode use the system font (in my case, DejaVu Sans) rather than the font you have selected, so the easiest thing to do is to type something first and then change the font. Switch to the Swahili keyboard (Ctrl+Alt+k), type m, i, Shift+i, m, i, and press Return. Then press Ctrl+a, choose Scheherazade as the font, and set the font size to something large, like 26. You should now see the Swahili word mimi (I, me) in Arabic script.

Continue typing the following words, ignoring commas, and pressing Return after each one:
s, a, Shift+a, s, a - sasa (now)
k, Shift+. (period), w, e, Shift+e, l, i - kweli (truly)
l, u, Shift+u, Shift+g, a - lugha (language)
n, Shift+n, o, Shift+o, m, b, e - ng'ombe (cattle)
AltGr+a, u, m, e, f, i, Shift+i, k, a - umefika (you have arrived)

You should now have something like this:

Other LibreOffice settings

On this site, the Arabic font has purposely been made quite large, so that the details of the text can be seen. You may wish to make the font size smaller. In Arabic, where vowel signs are only rarely used, reading the text is possible at quite small font sizes. In Swahili, however, the vowel signs are essential, so the same reductions in font size are not possible (I find 15pt to be the level at which legibility begins to be impaired, but your eyes might be better than mine!). In typesetting poetry, of course, the lines are short, and accuracy is improved by having a large font.

Some of the vowel signs (eg an e under a word-final ي) can get hidden below the visible window of the line, so it is often useful to set the line-spacing of the LibreOffice document to 1.5: press Ctrl+a, then click on Format → Paragraph, select 1.5 lines under Line spacing, and then click OK.

Number-handling when using the Swahili keyboard will depend on the settings under Tools → Options → Language Settings → Complex Text Layout. If you wish to use both Arabic-Indic numerals (on the numeral keys) and Western-Arabic numerals (AltGr+numeral), ensure Arabic or System is chosen here. The other two settings will convert Western-Arabic numerals to their Arabic-Indic equivalents.
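
To make the correspondence between the two numeral sets concrete, here is a small Python sketch that maps Western-Arabic digits (0-9) to Arabic-Indic digits (U+0660 to U+0669) and back. It only illustrates the digit-for-digit mapping; it does not claim to show how LibreOffice performs the conversion, and the function names are hypothetical.

```python
WESTERN_DIGITS = "0123456789"
# Arabic-Indic digits zero to nine, U+0660 to U+0669.
ARABIC_INDIC_DIGITS = "\u0660\u0661\u0662\u0663\u0664\u0665\u0666\u0667\u0668\u0669"

# Translation tables for str.translate, one per direction.
TO_ARABIC_INDIC = str.maketrans(WESTERN_DIGITS, ARABIC_INDIC_DIGITS)
TO_WESTERN = str.maketrans(ARABIC_INDIC_DIGITS, WESTERN_DIGITS)


def to_arabic_indic(text):
    """Replace Western-Arabic digits with Arabic-Indic ones; leave the rest alone."""
    return text.translate(TO_ARABIC_INDIC)


def to_western(text):
    """Replace Arabic-Indic digits with Western-Arabic ones; leave the rest alone."""
    return text.translate(TO_WESTERN)


if __name__ == "__main__":
    print(to_arabic_indic("1905"))
    print(to_western(to_arabic_indic("1905")))
```

Because only the ten digits are mapped, any letters or punctuation in the text pass through unchanged in both directions.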

There are a couple of dozen Arabic fonts available in the packages fonts-kacst and fonts-arabeyes, but virtually all of these fonts have been designed for Arabic only, and need characters added to them before they are useable with Swahili.

The only fonts that will work properly with Swahili at the time of writing are Scheherazade (Bob Hallissy and Jonathan Kew), Amiri (Khaled Hosny), and the fonts from the PakType project (of which perhaps Tehreer is the most attractive). All of these are in the Naskh style.

The main stylistic difference between Scheherazade and Amiri is that Amiri contains all the Arabic presentation forms (character combinations), making for more attractive text. For instance, Amiri وَلِتُمئَِ (walitumia, they used), compared to Scheherazade وَلِتُمئَِ, has the letters ltm combined in one ligature.

However, Amiri places all the vowels at the same height from the main letter, eg كُبٗرٖيشَ (kuboresha, to boost) compared to Scheherazade كُبٗرٖيشَ, and وَنَسَيَانْسِ (wanasayansi, scientists) compared to Scheherazade وَنَسيَانْسِ. This can lead to the upper vowels from the current line of text colliding with the lower vowels from the previous line, so Amiri may be more appropriate for use with text that is not fully vocalised.

To allow some variation in the fonts used for Swahili, I have adapted one of the Arabeyes fonts (a Kufic one, Granada) to add the characters necessary for it to be used for Swahili - I am grateful to Khaled Hosny for his advice here, but it should be noted that the responsibility for any infelicities caused to this very attractive font is mine alone! The hacked version of the font is available as GranadaKD. It is used as the title font in the sample of poetry in the Typesetting poetry section, and you will need to install this font if you want the titles to display properly.

Regarding the font to be used for close transliteration, note that the readability of diacritics (or even whether they are displayed at all) depends crucially on the font - not all will be capable of showing all diacritics, or placing them in the right location, so if something is not looking right, try using a different font. The font used here for close transliteration is Linux Biolinum O, which is in the fonts-linuxlibertine package.