Fast Thai Word Segmentation in Rust

kham is a batteries-included Thai NLP engine — zero external dependencies, no_std core, and ready for Rust, WebAssembly, Python, C, PostgreSQL, and SQLite.

# Cargo.toml
kham-core = "0.8"
$ pip install kham
$ npm install kham-wasm

Simple, zero-copy API

Segment Thai text into tokens with byte and Unicode char spans — suitable for search indexing, NLP pipelines, and binding to any language runtime.

Getting started guide →
main.rs
use kham_core::Tokenizer;

fn main() {
    let tok = Tokenizer::new();
    let tokens = tok.segment("กินข้าวกับปลา");
    // ["กิน", "ข้าว", "กับ", "ปลา"]
}
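Byte spans and Unicode char spans diverge quickly for Thai, because each Thai code point takes three bytes in UTF-8. The following is a minimal std-only sketch of that distinction. It does not use kham's API: the `Span` struct and the `spans_for` helper are hypothetical names introduced for illustration, and kham's real token type may look different.

```rust
/// Hypothetical span record for illustration only; not kham's token type.
#[derive(Debug, PartialEq)]
struct Span {
    byte_start: usize,
    byte_end: usize,
    char_start: usize,
    char_end: usize,
}

/// Compute byte and char spans for tokens that, in order, tile `text` exactly.
fn spans_for(text: &str, tokens: &[&str]) -> Vec<Span> {
    let mut spans = Vec::with_capacity(tokens.len());
    let (mut byte_pos, mut char_pos) = (0usize, 0usize);
    for &tok in tokens {
        // Each token must start where the previous one ended.
        assert!(text[byte_pos..].starts_with(tok), "tokens must tile the text");
        let byte_end = byte_pos + tok.len(); // len() counts UTF-8 bytes
        let char_end = char_pos + tok.chars().count(); // counts code points
        spans.push(Span {
            byte_start: byte_pos,
            byte_end,
            char_start: char_pos,
            char_end,
        });
        byte_pos = byte_end;
        char_pos = char_end;
    }
    spans
}
```

Calling `spans_for("กินข้าวกับปลา", &["กิน", "ข้าว", "กับ", "ปลา"])` places the second token, ข้าว, at bytes 9..21 and chars 3..7. Byte spans suit slicing a Rust `&str` directly, while char spans are usually what runtimes that index strings by code point (Python, for example) expect.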

Try it right now

Powered by WebAssembly — runs entirely in your browser, no server needed.

Ready to integrate?

Add kham to your Rust, Python, or Node.js project in minutes.