Kig

 



Kig is a web interface to the CIG1 and CIG2 corpora, which focus on child language acquisition in Welsh. They were assembled by Bob Morris Jones and colleagues. Detailed information about CIG1 and CIG2 is available at the Child Language Databases website, and the transcriptions are available from the CHILDES website.

The search boxes above allow you to search for a word across all files in the CIG1 and CIG2 corpora. When you enter a word, 20 utterances from the corpus containing that word are shown. For readability, most of the transcription marking is removed.

You can search for words used by a child or for words used by a non-child (adult) speaker. "Non-child" is defined here as any speaker who is not identified as a child, target child, or playmate.
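The child/non-child split and the 20-utterance limit described above can be sketched as follows. This is only an illustration, not Kig's actual implementation: the role labels, the `search` function, its parameters, and the sample utterances are all hypothetical, and the corpus's real speaker-role codes may be spelled differently.

```python
# Hypothetical role labels; the real corpus codes may differ.
CHILD_ROLES = {"child", "target child", "playmate"}


def is_non_child(role: str) -> bool:
    """True for any speaker not identified as a child, target child, or playmate."""
    return role.strip().lower() not in CHILD_ROLES


def search(utterances, word, want_child, limit=20):
    """Return at most `limit` utterances from the requested speaker group
    that contain `word` as a whitespace-separated token."""
    hits = []
    for role, text in utterances:
        speaker_is_child = not is_non_child(role)
        if speaker_is_child != want_child:
            continue  # utterance belongs to the other speaker group
        if word in text.split():
            hits.append(text)
            if len(hits) == limit:
                break
    return hits


# Invented sample data, for illustration only.
sample = [
    ("Target Child", "dw i isio llefrith"),
    ("Mother", "wyt ti isio llefrith ?"),
    ("Playmate", "llefrith i mi"),
    ("Investigator", "be ydy hwn ?"),
]

child_hits = search(sample, "llefrith", want_child=True)
non_child_hits = search(sample, "llefrith", want_child=False)
print(child_hits)
print(non_child_hits)
```

Defining the non-child group by exclusion means that any speaker role not listed as a child role (parents, investigators, other adults) is counted as non-child.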

CIG1, created in 1996, consists of 84 hours of transcribed recordings from seven children aged 18-30 months: four from North Wales (Alaw, Dewi, Elin and Rhys) and three from Mid Wales (Bethan, Melisa and Rhian).

CIG2 consists of 120 hours of transcribed recordings from 469 children aged 3-7 from across Wales. The recordings were collected in 1974-77 and transcribed in 1999-2000.

Other key parameters of the corpora are set out in the following table:

                           CIG1      CIG2
Files                       168       239
Total utterances          78766    151422
Total tokens             304846    566140
Total types                5498     12206
Non-child utterances      25286     40237
Non-child utterances %      32%       27%
Non-child tokens         222390    103755
Non-child tokens %          73%       18%
Non-child types            4869      4043
Non-child types %           89%       33%
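
The percentage rows in the table appear to be the non-child count divided by the corresponding total, rounded to the nearest whole percent. The short check below reproduces them under that assumption, reading each row of the table as a CIG1 value followed by a CIG2 value; the variable and function names are illustrative only.

```python
# (non-child count, total count) pairs taken from the table above.
counts = {
    "CIG1": {
        "utterances": (25286, 78766),
        "tokens": (222390, 304846),
        "types": (4869, 5498),
    },
    "CIG2": {
        "utterances": (40237, 151422),
        "tokens": (103755, 566140),
        "types": (4043, 12206),
    },
}


def share(part: int, total: int) -> int:
    """Percentage of `total` made up by `part`, rounded to a whole number."""
    return round(100 * part / total)


percentages = {
    corpus: {measure: share(part, total) for measure, (part, total) in rows.items()}
    for corpus, rows in counts.items()
}

for corpus, rows in percentages.items():
    for measure, pct in rows.items():
        print(f"{corpus} non-child {measure}: {pct}%")
```

On this reading, non-child speakers account for most of the tokens in CIG1 (73%) but only a small share in CIG2 (18%).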

Download CIG1   Download CIG2