Data-Driven Natural Language Processing (NLP) for Indic Languages

Situation: Over 2 billion people in the Indian subcontinent speak Indic languages as their mother tongue, but at the start of the millennium, few natural language processing tools existed for these languages.
Approach: Our team used data-driven methods to build the first generation of Indian language spellcheckers (trie-based), grapheme-to-phoneme mappers (rule-based), and text-to-speech systems. These tools were deployed in multiple Indian government institutions six years before private enterprises such as Google publicly released Indian language processing tools.
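
To illustrate what a trie-based spellchecker involves, here is a minimal sketch in Python. It is not the team's actual implementation: the class and method names (TrieSpellChecker, add, contains, suggest) and the three-word Hindi lexicon are hypothetical, and it assumes that comparing words one Unicode code point at a time is good enough, which a production system for Indic scripts would likely refine (for example, by normalizing text and working on syllable clusters).

```python
class TrieNode:
    """One node of the trie; each edge is a single Unicode code point."""

    def __init__(self):
        self.children = {}
        self.is_word = False


class TrieSpellChecker:
    """Hypothetical sketch: dictionary lookup plus edit-distance suggestions."""

    def __init__(self, words=()):
        self.root = TrieNode()
        for word in words:
            self.add(word)

    def add(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def contains(self, word):
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_word

    def suggest(self, word, max_distance=1):
        """Return (candidate, distance) pairs within max_distance edits."""
        results = []
        # Row 0 of the Levenshtein table: distance from the empty prefix.
        first_row = list(range(len(word) + 1))
        for ch, child in self.root.children.items():
            self._search(child, ch, ch, word, first_row, max_distance, results)
        return sorted(results, key=lambda pair: (pair[1], pair[0]))

    def _search(self, node, ch, prefix, word, prev_row, max_distance, results):
        # Extend the Levenshtein table by one row for the trie edge `ch`.
        row = [prev_row[0] + 1]
        for i in range(1, len(word) + 1):
            insert_cost = row[i - 1] + 1
            delete_cost = prev_row[i] + 1
            replace_cost = prev_row[i - 1] + (word[i - 1] != ch)
            row.append(min(insert_cost, delete_cost, replace_cost))
        if node.is_word and row[-1] <= max_distance:
            results.append((prefix, row[-1]))
        # Prune: descend only if some cell can still stay within the limit.
        if min(row) <= max_distance:
            for next_ch, child in node.children.items():
                self._search(child, next_ch, prefix + next_ch, word, row,
                             max_distance, results)


# Illustrative lexicon only; a real system would load a large word list.
checker = TrieSpellChecker(["भारत", "भाषा", "नमस्ते"])
print(checker.contains("भारत"))
print(checker.suggest("भारट", max_distance=1))
```

Storing the lexicon in a trie lets words that share a prefix share one row of the edit-distance table, and whole subtrees are skipped once every cell in a row exceeds the allowed distance. This keeps suggestion lookup practical for large word lists.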
