Live Demo

Thai word segmentation running entirely in your browser via WebAssembly — no server required. Type any Thai text and click Segment, or pick a sample.

Token kinds

  • Thai script
  • Latin
  • Number
  • Punctuation
  • Emoji

Span columns

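Chars: Unicode scalar-value (code point) offsets, usable directly as Python str slice indices. JavaScript's str.slice() counts UTF-16 code units instead, so the offsets match only while the preceding text stays within the Basic Multilingual Plane. Thai script does; most emoji do not.

Bytes: UTF-8 byte offsets, used internally by Rust / PostgreSQL FTS.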
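
The relationship between the two span columns can be sketched in Python. This is a minimal illustration rather than the demo's own code: the helper name `char_span_to_byte_span` and the sample string are assumptions made for this example.

```python
def char_span_to_byte_span(text: str, start: int, end: int) -> tuple[int, int]:
    """Convert a half-open [start, end) code-point span into a UTF-8 byte span.

    Hypothetical helper for illustration; it re-encodes the prefix on every
    call, which is fine for short inputs but not for bulk conversion.
    """
    byte_start = len(text[:start].encode("utf-8"))
    byte_end = byte_start + len(text[start:end].encode("utf-8"))
    return byte_start, byte_end


# Sample input (an assumption): Thai word, space, emoji, space, Latin word.
text = "สวัสดี 😀 ok"

# Python str indices count code points, so Chars offsets slice directly.
thai = text[0:6]
emoji = text[7:8]

# Each Thai code point takes 3 bytes in UTF-8, the emoji takes 4.
thai_bytes = char_span_to_byte_span(text, 0, 6)
emoji_bytes = char_span_to_byte_span(text, 7, 8)

# Byte offsets index into the encoded form, not into the str itself.
encoded = text.encode("utf-8")
assert encoded[thai_bytes[0]:thai_bytes[1]].decode("utf-8") == thai
assert encoded[emoji_bytes[0]:emoji_bytes[1]].decode("utf-8") == emoji
```

Here the Thai word occupies chars 0 to 6 but bytes 0 to 18, and the emoji occupies chars 7 to 8 but bytes 19 to 23, which is why the two columns diverge as soon as the text contains non-ASCII characters.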

Coming in v0.3

  • POS tagging (13 categories)
  • Named Entity Recognition
  • Romanization (RTGS)
  • Phonetic codes (lk82 / udom83)