graphemes++

A grapheme-aware toolkit for evaluating Tamil and Sinhala text. A single visually perceived character (a grapheme) often spans several Unicode code points, so these tools measure at the grapheme level, complementing character-based metrics like chrF.
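
To illustrate the gap between code points and graphemes, here is a minimal sketch using only the Python standard library. It is not this toolkit's API: `approx_graphemes` is a hypothetical helper, and attaching combining marks and zero-width joiners to the preceding character is only a rough approximation of the extended grapheme clusters defined in Unicode Standard Annex #29.

```python
import unicodedata

# Zero-width non-joiner and zero-width joiner; both attach to the preceding
# cluster in this simplified scheme (an assumption, not full UAX #29).
ZW_JOINERS = {"\u200c", "\u200d"}


def approx_graphemes(text: str) -> list[str]:
    """Group each base character with the combining marks that follow it.

    Hypothetical helper for illustration only.
    """
    clusters: list[str] = []
    for ch in text:
        # General categories Mn, Mc and Me cover vowel signs and the virama.
        extends = unicodedata.category(ch).startswith("M") or ch in ZW_JOINERS
        if clusters and extends:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# The Tamil word "தமிழ்", written with escapes so the code points are visible.
word = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"
print(len(word))                    # 5 code points
print(len(approx_graphemes(word)))  # 3 clusters: த, மி, ழ்
```

A character-level metric would treat this word as five units, while a grapheme-level one treats it as three. A production implementation would more likely rely on a full UAX #29 segmenter than on this category check.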