Me, Myself, Why?

Free software and languages, not necessarily in that order…



Words

Fings wot I have wrote

Stuff I've had a hand in writing or publishing in one form or another.


Eurfa v3

Free (GPL) Welsh dictionary

The largest Welsh dictionary under a free license, including verbal inflections and mutated forms.

 Go to the website 

Andika!

Write Swahili in Arabic script

Tools to make Swahili in Arabic script as easy to use as Swahili in Roman script, with provision for traditional manuscript poetry.

 Go to the website 

KoSajeon

Free (GPL) Korean dictionary

16,000 words searchable in hangeul, English, or romanisation.

 Go to the website 

Utenzi wa Jaafari

Traditional Swahili ballad

Annotated edition of a previously unpublished ballad, with the Arabic-script text produced using Andika!

 Go to the website 

Dramâu Cymru

Corpus of Welsh plays

Showcases plays from Wales, no matter their period or language.

 Go to the website 

BangorTalk

Bilingual conversational corpora

Welsh-English, Welsh-Spanish and Spanish-English corpora for linguistic research on code-switching.

 Go to the website 

Deloof

Jan Deloof's Breton-Dutch dictionary

Detailed dictionary with 40,000 entries.

 Go to the website 

Autoglosser v2

Automated glossing for Welsh

New, faster version of the Bangor Autoglosser, aimed at POS-tagging written Welsh text rather than conversational multilingual text.

 Go to the website 

Duval

Terry Duval's Māori gainword corpus

120,000 words (around 6,000 tokens) drawn from citations of gainwords (loanwords or borrowings) in Māori-language publications printed between 1815 and 1899.

 Go to the website 

Kynulliad3

Welsh/English corpus of Assembly Proceedings

360,000 aligned sentences in Welsh and English.

 Go to the website 

SiarCorp

Corpus of conversational Welsh

Searchable version of the BangorTalk Siarad corpus

 Go to the website 

PatCorp

Corpus of Patagonian Welsh

Searchable version of the BangorTalk Patagonia corpus

 Go to the website 

MiCorp

Spanish-English conversational corpus

Searchable version of the BangorTalk Miami corpus

 Go to the website 

Gàidhlig

Proof-of-concept Gàidhlig autoglosser

Two small POS-tagged corpora and a small GPLed dictionary.

 Go to the website 

Kwici

Welsh Wikipedia corpus

A 4m-word corpus drawn from the Welsh Wikipedia as of 30 December 2013.

 Go to the website 

Korrect

Welsh/English corpus of software translations

43,000 aligned items drawn from projects to translate free or open software into Welsh.

 Go to the website 

Kig

Bob Morris Jones's language acquisition corpora

Web interface to the CIG1 and CIG2 corpora, which focus on child language acquisition in Welsh.

 Go to the website 

Autoglosser v1

Tagger for Welsh, Spanish and English

Collection of tools used to POS-tag the BangorTalk corpora.

 Go to the website 

Māori

Proof-of-concept Māori autoglosser

A small POS-tagged corpus and a small GPLed dictionary.

 Go to the website 

Swwiki

Swahili Wikipedia corpus

A 2.8m-word corpus drawn from the Swahili Wikipedia as of December 2015.

 Go to the website 

Swaseg

Swahili verb segmenter

Allows Swahili verbforms to be segmented for use in parsers or taggers.

 Go to the website 

tikz-pitch-contour

Pitch contours in LaTeX

Gives a visual indication of pitch patterns.

 Go to the website 

apertium-cy

Welsh-English translator

Experimental translator aiming to give at least the gist of a Welsh text in English.

 Go to the website 

langswitcher

Switch languages on webpages

Provides a dropdown the reader can use to select the language of the webpage.

 Go to the website 

Rhymer

Welsh rhyming dictionary

Uses Eurfa to produce lists of rhyming words in order of length, with shorter words at the top of the list.

 Go to the website