Swahili in Arabic script - easy, quick, versatile!

For centuries, Swahili was written in Arabic script, and hundreds of manuscripts in collections around the world testify to its long tradition of written literature. Over the last century, however, Swahili in Roman script has become the norm.

Andika! (meaning Write! in Swahili) is a set of tools that makes Swahili in Arabic script as easy to use as Swahili in Roman script - it is equally easy to read and write the language in either script. The tools, based on the work of Marehemu Mu'allim Sheikh Yahya Ali Omar [1], provide a consistent, standardised transliteration of Swahili in Arabic script, and a one-to-one mapping of this to Swahili in Roman script.

All the tools are available under the Free Software Foundation's General Public License and Affero General Public License, which means they can be adapted and extended as required by the user, subject to the same license being used for any new version thus created.

The code for the Andika! tools (including these webpages) is available from the GitHub repository. If you have Git installed, you can download the files by running:
git clone https://github.com/donnekgit/andika.git
If not, you can download the files as a compressed zip file.

Further examples of Andika! output, along with detailed information about how to install and use the offline tools to handle manuscripts, are in the detailed manual.

To show Andika! in use, the tools have been used to create an edition of a hitherto unpublished ballad (Utenzi wa Jaʿfar, The Ballad of Jaʿfar) from two manuscripts. A variety of outputs are offered, and a paper is in preparation to demonstrate the benefits of using Andika! from a textual analysis viewpoint.

I am always happy to receive comments or suggestions about the further development of the Andika! tools.

Andika! is dedicated to the memory of Sheikh Yahya Ali Omar (1924–2008).
مْزٖئٖ أَكِيفَ، مَكتَابَ هُتٖكٖتٖئَ

[1] Omar, YA in collaboration with Frankl PJL (1997): "An historical review of the Arabic rendering of Swahili, together with proposals for the development of a Swahili writing system in Arabic script." Journal of the Royal Asiatic Society, Series 3, 7, 1: 55-71.

  • Swahili manuscripts in Arabic script can be directly transcribed and made available in digital format. At present, most Swahili literature from earlier periods has only been published in Roman transliteration. The fourth image in the slideshow shows part of a manuscript rendering of Bajuni fishing songs, transcribed letter-for-letter from the manuscript.
  • A direct transcription can be augmented with a fully-vocalised Arabic transcription, a close phonetic transliteration, a transliteration in the standard Roman orthography, and so on, according to taste. The tools allow many of these to be generated automatically, reducing the effort this would otherwise involve.
  • Apart from allowing easier typesetting and dissemination, having manuscripts in digital form will make it possible for the first time to use computers to look at word frequency, stylistic variation, etc, within the texts, to build corpora for classical Swahili, and so on.
  • New writing in Swahili can be composed in Arabic script and published easily via word-processors, webpages, or pdfs created by typesetting systems such as LaTeX.
  • The ability to convert Arabic script at any time into Roman script means that there is very little overhead involved in choosing to write Swahili in Arabic script. Material can be produced simultaneously in both scripts with the minimum of effort (although the converted text will need minor editing to cover such things as capital letters, which do not exist in Arabic script).
  • Existing Swahili content in Roman script can be converted to Arabic script, making it possible to reuse content already published in Roman script. This means that large amounts of material in Arabic script can be made available very quickly - it is not necessary to create them specially.
  • The conversion tools include cut-and-paste boxes for small amounts of text, a Roman-to-Arabic converter for entire webpages, and offline converters for bulk conversion of text.
  • The Arabic-to-Roman conversion can be extended to provide a variety of different transliterations (eg kh or x for خ instead of the standard h).
  • The Roman-to-Arabic conversion can be adjusted to convert numerals, to add or remove markers such as sakani (sukun), and so on.
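
The idea of swapping in a different transliteration for a single letter can be sketched in a few lines of Python. This is only an illustration, not the Andika! code: the table is a tiny toy subset chosen for the example, and the names BASE_TABLE, make_table and arabic_to_roman are hypothetical.

```python
# Toy Arabic-to-Roman table for illustration only - NOT the Andika! mapping.
# It covers just the characters needed for the sample word below.
BASE_TABLE = {
    "\u062E": "h",  # خ, using the standard Swahili rendering h
    "\u0628": "b",  # ب
    "\u0631": "r",  # ر
    "\u064E": "a",  # fatha
    "\u0650": "i",  # kasra
    "\u064F": "u",  # damma
    "\u0652": "",   # sakani (sukun): no vowel
}


def make_table(overrides=None):
    """Return a copy of the base table with optional per-letter overrides."""
    table = dict(BASE_TABLE)
    if overrides:
        table.update(overrides)
    return table


def arabic_to_roman(text, table):
    """Replace each character using the table; unknown characters pass through."""
    return "".join(table.get(ch, ch) for ch in text)


# Sample word: خَبَرِ (habari)
WORD = "\u062E\u064E\u0628\u064E\u0631\u0650"

standard = arabic_to_roman(WORD, make_table())
with_kh = arabic_to_roman(WORD, make_table({"\u062E": "kh"}))
with_x = arabic_to_roman(WORD, make_table({"\u062E": "x"}))

print(standard, with_kh, with_x)
```

Keeping the default table separate from the overrides means that an alternative transliteration is just a small set of changes layered on the standard one, which matches the kind of extension described in the list above.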

It's very easy to get started. The most important thing is to install a font that will show all the Arabic characters used in Swahili - once that is done, you can use all the resources on this website.

The best font option at the moment is Scheherazade - to get the most recent version, download ScheherazadeRegOT-1.005.zip from this page, and unzip it. Then double-click the file ScheherazadeRegOT.ttf to install the font on your computer.

You can type Swahili in Arabic script directly into a Linux computer using a standard UK keyboard. Input speed is comparable to typing in Roman script. The Swahili keyboard section includes simple step-by-step instructions for enabling a Swahili keyboard layout on Ubuntu running KDE, and for making changes to it if you have a non-UK keyboard.

Alternatively, if you don't want to do that just yet, go to the Roman to Arabic section. There, you can type into a box in Roman script and have the input converted into Arabic script, or you can input a web address and have that whole page converted into Arabic script. The offline converters allow multiple files to be converted easily.

You can cut and paste the converted Arabic text into a word-processor. See the LibreOffice tab in the Swahili keyboard section.

You can also go to the Arabic to Roman section to convert Arabic script into Roman script (standard Swahili orthography by default, but options are available to add various diacritics to the transliteration). The offline converters allow a transcription of an existing Swahili manuscript in Arabic script to be automatically transliterated into Roman script.

The Spelling conventions section gives suggestions for using Arabic script with current-day Swahili, with many examples of how the Arabic script maps to the Roman script.

The section on Typesetting poetry shows how Swahili poetry manuscripts in Arabic script can be transcribed digitally to produce attractive output in various formats, with the added benefit that the contents of the manuscripts are then available for computer analysis of language, vocabulary, word-frequency, etc.
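
As a rough idea of what such computer analysis might look like once a transcription exists, here is a minimal Python sketch that counts word frequencies. It is not part of the Andika! tools: the function name word_frequencies is hypothetical, the sample lines are two lines of the close transcription of Utenzi wa Mkunumbi given under Image 5, and the sketch assumes that * only marks the break between half-lines and should not be counted.

```python
from collections import Counter
import unicodedata

# Two lines from the close transcription of Utenzi wa Mkunumbi (see Image 5).
LINES = [
    "dōla mbili ziliwāna * shikuwe nā simba mbawāna",
    "zikiṯimu siku ṯāṯu * shikuwe kaṯaka wāṯu",
]


def word_frequencies(lines):
    """Count words across lines, ignoring the * half-line separator."""
    counts = Counter()
    for line in lines:
        # Normalise so that the same accented letter always has one encoding.
        normalised = unicodedata.normalize("NFC", line)
        counts.update(word for word in normalised.split() if word != "*")
    return counts


freq = word_frequencies(LINES)
print(freq.most_common(3))
```

The same approach works on the Arabic-script transcription, since the counting depends only on whitespace-separated words and not on the script used.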

The images in the slideshow give examples of various pieces of Swahili written in Arabic script. The intention is to show that the Andika! tools can cope with a variety of different requirements - all the copies are transcribed letter-for-letter from the originals.

Image 1

The word andika! (write!) is shown in various Arabic fonts, giving some idea of the variety and expressiveness of the script. Virtually all of these fonts, however, have been designed for Arabic only, and need characters added to them before they are useable with Swahili. See the Fonts tab in the Keyboard section.

Image 2

A copy of the specimen text (Appendix C) from the Omar/Frankl paper, which they included to show how their system would look in practice. The text is a section from: Omar, YA (1998): Three Prose Texts in the Swahili of Mombasa. Mit einer Einleitung von PJL Frankl. Sprache und Oralität in Afrika, Frankfurter Studien zur Afrikanistik, Band 21. Berlin: Dietrich Reimer.

Image 3

A section of the Swahili Wikipedia page on أُتَمَدُونِ (utamaduni, culture), converted using Andika! conventions - see the section Spelling conventions.

Image 4

A copy of the first few verses from a manuscript written by Sheikh Yahya himself, giving a transcription of Bajuni fishing songs. (See: Donnelly, K and Omar, YA (1982): “Structure and association in Bajuni fishing songs”. In: Genres, Forms, Meanings: Essays in African Oral Literature, edited by Veronika Görög-Karady, JASO Occasional Papers 1, Oxford.) The Arabic script in this manuscript differs in a number of minor respects from the one in the Omar/Frankl paper. The Roman conversion in this case uses various diacritics to reconcile the manuscript's representation of the Bajuni dialect with standard orthography. This close transcription (like the default standard transcription) is generated automatically, and can be made the default. Further converters can be set up to reflect other transcription conventions.

Image 5

A copy of verses 3-5 of: Harries, L (1967): Utenzi wa Mkunumbi. A Swahili Potlatch – The Poem about Mkunumbi. Nairobi: East African Literature Bureau. This is one of the few books of Swahili classical poetry to include the text in Arabic script alongside the Roman transcription, in this case a photocopy of a copy of the original manuscript made by Sheikh Yahya. The Arabic script in this manuscript is less well-adapted to Swahili - for instance, o is not used consistently. The Roman transcription is Harries' own - a generated close transcription might be:

dōla mbili ziliwāna * shikuwe nā simba mbawāna
kamaṯezo kushindāna * mṯāna nalayliyā
zikiṯimu siku ṯāṯu * shikuwe kaṯaka wāṯu
kuṯukuwa chāke kiṯu * nḡūbe kay nunuliyā
kaṯiya ngūbe ndiyāni * mema āsiyu lahāni
simba shī kabaīni * mpāni ngūbe mmuyā

  • Type in Arabic script using a standard keyboard
  • Convert from Arabic script to Roman script
  • Convert from Roman script to Arabic script
  • Fine-tune the conversions
  • All code licensed under the GPL/AGPL

Contact me

You can contact me, Kevin Donnelly, at kevin, then a curly at-sign, then dotmon, followed by a full-stop and com.