Live Demo
Thai word segmentation running entirely in your browser via WebAssembly — no server required. Type any Thai text and click Segment, or pick a sample.
| # | Text | Kind | Chars | Bytes |
|---|---|---|---|---|
Token kinds
- Thai script
- Latin
- Number
- Punctuation
- Emoji
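As a rough illustration of how a token could be assigned to one of these kinds, here is a minimal Python sketch that inspects the first character of a token. The function name `token_kind` and the rules themselves are assumptions made for this example, not the segmenter's actual logic. In particular, emoji are approximated by the Unicode category "So" (Symbol, other), which is only a heuristic.

```python
import unicodedata


def token_kind(token: str) -> str:
    """Guess a token kind from its first character (illustrative heuristic only)."""
    if not token:
        return "Other"
    ch = token[0]
    category = unicodedata.category(ch)
    # Decimal digits first, so Thai digits (U+0E50..U+0E59) count as numbers.
    if category == "Nd":
        return "Number"
    # Thai Unicode block.
    if 0x0E00 <= ord(ch) <= 0x0E7F:
        return "Thai"
    # Letters whose Unicode name starts with LATIN.
    if category.startswith("L") and unicodedata.name(ch, "").startswith("LATIN"):
        return "Latin"
    if category.startswith("P"):
        return "Punctuation"
    # Assumption: treat "Symbol, other" as emoji; this over-matches some symbols.
    if category == "So":
        return "Emoji"
    return "Other"


for sample in ["ไทย", "ok", "7", "๗", "!", "😀"]:
    print(sample, token_kind(sample))
```

The digit check runs before the Thai-block check on purpose: Thai digits live inside the Thai block, so the order decides which kind they receive.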
Span columns
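Chars: Unicode scalar-value (code point) offsets, suitable for Python string slicing. They also match JavaScript str.slice() indices as long as the text stays within the Basic Multilingual Plane, which covers Thai script but not most emoji, because JavaScript strings are indexed by UTF-16 code units.
Bytes: UTF-8 byte offsets, used internally by Rust / PostgreSQL FTS.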
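To make the difference between the two span columns concrete, the following Python sketch converts a character span into the corresponding UTF-8 byte span, and also into UTF-16 code-unit offsets of the kind JavaScript string methods expect. Both helper names are hypothetical and exist only for this example; they are not part of the library's API.

```python
from typing import Tuple


def char_span_to_byte_span(text: str, start: int, end: int) -> Tuple[int, int]:
    """Convert a [start, end) code-point span to UTF-8 byte offsets."""
    byte_start = len(text[:start].encode("utf-8"))
    byte_end = byte_start + len(text[start:end].encode("utf-8"))
    return byte_start, byte_end


def char_span_to_utf16_span(text: str, start: int, end: int) -> Tuple[int, int]:
    """Convert a [start, end) code-point span to UTF-16 code-unit offsets."""
    # Each UTF-16 code unit is 2 bytes, so divide the encoded length by 2.
    unit_start = len(text[:start].encode("utf-16-le")) // 2
    unit_end = unit_start + len(text[start:end].encode("utf-16-le")) // 2
    return unit_start, unit_end


text = "ไทย😀ok"
# The token "ok" occupies code points 4..6.
print(text[4:6])                              # ok
print(char_span_to_byte_span(text, 4, 6))     # (13, 15)
print(char_span_to_utf16_span(text, 4, 6))    # (5, 7)
```

Each Thai character takes 3 bytes in UTF-8 and the emoji takes 4, so the byte span starts at 13. The emoji also takes two UTF-16 code units, which is why the UTF-16 span is shifted by one relative to the character span.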
Coming in v0.3
- POS tagging (13 categories)
- Named Entity Recognition
- Romanization (RTGS)
- Phonetic codes (lk82 / udom83)