Kalenjin · ~5M speakers · Nilotic · Kenya / diaspora
Three demos.
One language.
§ 01 — About Kalenjin
The language and the work.
Kalenjin is a Nilotic language spoken by ~5M people across the Kenyan Rift Valley, parts of Tanzania, Uganda, and the diaspora. It's missing from every major LLM and ASR corpus.
This page indexes our open-weights work to fix that. Three demos, one shared evaluation methodology, all on Hugging Face. Speak, type, and listen back — see how a fine-tune holds up against the language it was trained on.
§ 02 — Demos
Three demos. Same evaluation methodology.
01 Translate: English in. Kalenjin out.
Cascade MT · NLLB-200 + LoRA · 58.79 chrF++ · ~2s warm latency
Open demo →

02 Transcribe: Ng'alal Kalenjin.
Low-resource ASR · Whisper-v3-turbo + LoRA · 26.4% CER · ~3s warm latency
Open demo →

03 Grader: Say it back to me.
Pronunciation grader · planned
Notify me →
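For readers unfamiliar with the CER figure quoted for the Transcribe demo, here is a minimal sketch of how character error rate is commonly computed: total character-level edit distance divided by total reference characters. This is the textbook definition, not necessarily the project's exact evaluation script. The function names `edit_distance` and `cer` are illustrative, and any text normalization applied before scoring (casing, punctuation, apostrophes) is unknown and therefore left out.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference characters into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # reference character missing from hypothesis
                curr[j - 1] + 1,         # extra character in hypothesis
                prev[j - 1] + (r != h),  # substitution, or free match
            ))
        prev = curr
    return prev[-1]


def cer(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level CER: total character edits / total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


# Hypothetical transcript that drops the apostrophe:
# 1 edit / 16 reference characters = 0.0625
print(cer(["ng'alal kalenjin"], ["ngalal kalenjin"]))
```

Because a single dropped apostrophe counts as an error under this definition, the choice of normalization can move the score noticeably, so a CER figure is only comparable across systems scored the same way.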
§ 03 — Author
A nights-and-weekends tinkering project by Tony Kipkemboi.