I’ve been working with Fran Tyers and the Apertium people over the past few months, and one of the issues for any MT system is dealing with the source-language text that is fed into it. Out of interest, I decided to look at how an agglutinative language like Quechua might be handled, and the result is a very basic Quechua segmenter – there’s more info on the page. The code needs much more work (e.g. the ability to accept connected, punctuated text) and the dictionary needs to be much bigger, but it already works quite well.
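To give a rough idea of what segmenting an agglutinative word involves, here is a minimal sketch in Python. It is not the code of the segmenter mentioned above: the function names `segment` and `_split_suffixes` are made up for illustration, and the tiny `ROOTS` and `SUFFIXES` sets are toy stand-ins for a real dictionary. The approach assumed here is simply to find a known root at the start of the word and then split the remainder into known suffixes, backtracking when a choice leads nowhere.

```python
# Toy dictionary, for illustration only; a real segmenter needs far more entries.
ROOTS = {"wasi", "runa"}                       # "house", "person"
SUFFIXES = {"kuna", "pi", "ta", "y", "manta"}  # plural, locative, accusative, 1sg possessive, ablative


def _split_suffixes(rest, suffixes):
    """Split `rest` into a list of known suffixes, or return None if impossible."""
    if not rest:
        return []
    # Try the longest candidate first, and backtrack if the remainder cannot be split.
    for end in range(len(rest), 0, -1):
        if rest[:end] in suffixes:
            tail = _split_suffixes(rest[end:], suffixes)
            if tail is not None:
                return [rest[:end]] + tail
    return None


def segment(word, roots=ROOTS, suffixes=SUFFIXES):
    """Return [root, suffix, ...] for a single word, or None if no analysis is found."""
    word = word.lower()
    for end in range(len(word), 0, -1):
        if word[:end] in roots:
            tail = _split_suffixes(word[end:], suffixes)
            if tail is not None:
                return [word[:end]] + tail
    return None


if __name__ == "__main__":
    for w in ["wasikunapi", "wasiykunapi", "runamanta", "xyz"]:
        print(w, "->", segment(w))
```

This sketch handles one unpunctuated word at a time and returns only the first analysis it finds; it ignores suffix ordering rules and any spelling changes at morpheme boundaries, which a real segmenter would have to deal with.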
About me
- I'm Kevin Donnelly, and I live in Llanfairpwllgwyngylch gogerychwyrndrobwllllantysiliogogogoch. Most of my projects relate to linguistics in some form or other (largely Welsh in the past), or to things like audio, electronics, typesetting, etc. that I find interesting and that I can work with on GNU/Linux. You can contact me directly on my first name, plus dotmon, and then add a com at the end ...
You are currently browsing the archives for August, 2007.