
Live Demo

Thai NLP running entirely in your browser via WebAssembly, with no server required. Segment text, explore part-of-speech (POS) and named-entity (NE) tags, split sentences, or test phonetic matching.


Phonetic Soundex

LK82: 12 consonant groups · 4-char code · most widely used


Text Normalizer

Collapses duplicate tone marks and composes nikhahit + sara aa into sara am (อำ).
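
The two normalization steps described above can be sketched in a few lines of Python. This is an illustrative sketch, not the demo's actual implementation: the function name `normalize_thai` is hypothetical, "duplicate tone marks" is assumed to mean consecutive repeats of the same mark, and a tone mark sitting between nikhahit and sara aa is assumed to be kept in front of the composed sara am.

```python
import re

# Thai tone marks: mai ek, mai tho, mai tri, mai chattawa (U+0E48..U+0E4B)
TONE_MARKS = "\u0e48\u0e49\u0e4a\u0e4b"
NIKHAHIT = "\u0e4d"  # the small circle written above a consonant
SARA_AA = "\u0e32"   # sara aa
SARA_AM = "\u0e33"   # sara am, the precomposed form

# One tone mark followed by one or more copies of the same mark.
_DUP_TONE = re.compile(f"([{TONE_MARKS}])\\1+")
# Nikhahit, an optional tone mark, then sara aa.
_NIKHAHIT_AA = re.compile(f"{NIKHAHIT}([{TONE_MARKS}]?){SARA_AA}")


def normalize_thai(text: str) -> str:
    """Hypothetical normalizer: collapse repeated tone marks, compose sara am."""
    # Step 1: keep a single copy of any repeated tone mark.
    text = _DUP_TONE.sub(r"\1", text)
    # Step 2: nikhahit + sara aa becomes sara am; a tone mark in between is kept.
    return _NIKHAHIT_AA.sub(lambda m: m.group(1) + SARA_AM, text)


if __name__ == "__main__":
    before = "น\u0e49\u0e49\u0e4d\u0e32"  # doubled mai tho, decomposed sara am
    after = normalize_thai(before)
    print([hex(ord(c)) for c in after])
```

Normalizing before indexing means that visually identical strings typed in different ways compare equal at search time.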


Number Conversion

Thai number words, Thai digits (๐–๙), and Baht currency text.
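
As a rough illustration of the three conversions named above, the following Python sketch maps between Thai and Arabic digits, spells out integers as Thai number words, and formats a whole-baht amount. All function names are hypothetical and are not the demo's API. The sketch assumes non-negative integers below one million and whole baht amounts only (no satang).

```python
# Thai digits occupy U+0E50..U+0E59.
THAI_DIGITS = "".join(chr(0x0E50 + i) for i in range(10))
TO_ARABIC = str.maketrans(THAI_DIGITS, "0123456789")
TO_THAI = str.maketrans("0123456789", THAI_DIGITS)

ONES = ["", "หนึ่ง", "สอง", "สาม", "สี่", "ห้า", "หก", "เจ็ด", "แปด", "เก้า"]
PLACES = ["", "สิบ", "ร้อย", "พัน", "หมื่น", "แสน"]  # units up to hundred-thousands


def thai_to_arabic_digits(text: str) -> str:
    """Replace Thai digits with ASCII digits; other characters are untouched."""
    return text.translate(TO_ARABIC)


def arabic_to_thai_digits(text: str) -> str:
    """Replace ASCII digits with Thai digits; other characters are untouched."""
    return text.translate(TO_THAI)


def thai_number_words(n: int) -> str:
    """Spell out 0 <= n < 1,000,000 in Thai (hypothetical helper)."""
    if not 0 <= n < 1_000_000:
        raise ValueError("this sketch only handles 0..999,999")
    if n == 0:
        return "ศูนย์"
    digits = str(n)
    words = []
    for i, ch in enumerate(digits):
        d = int(ch)
        place = len(digits) - 1 - i
        if d == 0:
            continue
        if place == 1 and d == 1:
            words.append("สิบ")      # 1x is "sip", not "nueng sip"
        elif place == 1 and d == 2:
            words.append("ยี่สิบ")    # 2x is "yi sip"
        elif place == 0 and d == 1 and len(digits) > 1:
            words.append("เอ็ด")     # trailing 1 in a multi-digit number is "et"
        else:
            words.append(ONES[d] + PLACES[place])
    return "".join(words)


def baht_text(n: int) -> str:
    """Whole-baht amount in words, with the 'exactly' suffix."""
    return thai_number_words(n) + "บาทถ้วน"


if __name__ == "__main__":
    print(thai_to_arabic_digits("๒๕๖๗"))
    print(thai_number_words(21))
    print(baht_text(1234))
```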


Token kinds

  • Thai · Named entity
  • Latin
  • Number
  • Punctuation
  • Emoji
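
One way to picture how the script-based token kinds could be assigned is a classifier keyed on a token's first character. The Python sketch below is an assumption about the general approach, not the demo's tokenizer: `token_kind` is a hypothetical name, the emoji check is a rough heuristic based on the Unicode "So" category, and named-entity detection (which needs a model or dictionary) is left out.

```python
import unicodedata


def token_kind(token: str) -> str:
    """Guess a token kind from its first character (illustrative heuristic)."""
    if not token:
        raise ValueError("empty token")
    ch = token[0]
    # ASCII digits and Thai digits are both in Unicode category Nd.
    if ch.isdigit():
        return "number"
    # Thai block: U+0E00..U+0E7F.
    if "\u0e00" <= ch <= "\u0e7f":
        return "thai"
    if ch.isascii() and ch.isalpha():
        return "latin"
    category = unicodedata.category(ch)
    if category.startswith("P"):
        return "punctuation"
    # Most emoji are "Symbol, other"; this also matches some non-emoji symbols.
    if category == "So":
        return "emoji"
    return "other"


if __name__ == "__main__":
    for tok in ["กิน", "hello", "42", "๔๒", "!", "😀"]:
        print(tok, token_kind(tok))
```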

FTS mode extras

  • POS — 13 ORCHID-derived categories
  • NE — Person · Place · Org
  • Stop — built-in stopword list
  • Roman — RTGS romanization toggle
  • Synonyms — number normalization

Soundex algorithms

  • LK82 — 12 groups · 4-char code
  • Udom83 — 14 groups · finer sibilants
  • MetaSound — 3 chars per syllable
  • Used for fuzzy / phonetic full-text search