Research
The project proposes an interlingua framework rooted in the traditional grammatical system of Sanskrit. It focuses on Hindi, Sanskrit, and Kannada as a representative subset of Indian languages, so that observations and innovations for any one of these languages can benefit the others. Towards its primary objective of text-to-text translation, the project has the following sub-objectives:
- Develop an explainable interlingua interpreter that captures linguistic characteristics of the languages considered and facilitates translation among them.
- Develop a text-to-text (T2T) translation module that translates between text in the different languages and the interlingua structure.
- Conduct human benchmarking of the explainability and linguistic correctness of the interlingua as a post-hoc explainer.
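The interlingua approach described above can be sketched as a pivot architecture: each language gets one analyzer (text to interlingua) and one generator (interlingua to text), and every translation passes through the shared structure, which can then be inspected. The sketch below is only an illustration under stated assumptions, not the project's design: the `Interlingua` class, the `PivotTranslator` class, the `make_toy_language` helper, and the `<concept>_<lang>` token convention are all hypothetical placeholders standing in for real linguistic analysis.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Interlingua:
    """Hypothetical language-neutral structure: an ordered tuple of concept labels."""
    concepts: Tuple[str, ...]


Analyzer = Callable[[str], Interlingua]
Generator = Callable[[Interlingua], str]


def make_toy_language(lang: str) -> Tuple[Analyzer, Generator]:
    """Build a placeholder analyzer/generator pair for one language.

    Toy convention (an assumption for illustration only): every surface
    token is written as "<concept>_<lang>", e.g. "book_hi".
    """
    suffix = "_" + lang

    def analyze(text: str) -> Interlingua:
        concepts = []
        for token in text.split():
            if not token.endswith(suffix):
                raise ValueError(f"{token!r} is not a toy {lang} token")
            concepts.append(token[: -len(suffix)])
        return Interlingua(tuple(concepts))

    def generate(inter: Interlingua) -> str:
        return " ".join(concept + suffix for concept in inter.concepts)

    return analyze, generate


class PivotTranslator:
    """Routes every translation through the shared interlingua."""

    def __init__(self) -> None:
        self.analyzers: Dict[str, Analyzer] = {}
        self.generators: Dict[str, Generator] = {}

    def register(self, lang: str, analyzer: Analyzer, generator: Generator) -> None:
        self.analyzers[lang] = analyzer
        self.generators[lang] = generator

    def translate(self, text: str, src: str, tgt: str) -> Tuple[str, Interlingua]:
        # Return the intermediate structure too, so it can be inspected.
        inter = self.analyzers[src](text)
        return self.generators[tgt](inter), inter


translator = PivotTranslator()
for lang in ("hi", "sa", "kn"):  # Hindi, Sanskrit, Kannada
    translator.register(lang, *make_toy_language(lang))

output, trace = translator.translate("boy_hi reads_hi book_hi", src="hi", tgt="kn")
print(output)  # boy_kn reads_kn book_kn
print(trace)
```

Returning the intermediate `Interlingua` alongside the output is the design choice that matters here: it is what would let a human evaluator check the structure the system actually translated from.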
Use Cases
A linguistically faithful and interpretable system for translating text among multiple Indian languages.
Projects
National Language Translation Mission (NLTM)
Language plays a vital role in shaping national identity, especially in a multilingual country like India, where digital information is spread across many languages. Enabling access to content in native languages is essential, creating the need for a framework to support seamless information sharing across languages.
Under the Technology Development for Indian Languages (TDIL) Programme, efforts have been made to standardize Indian languages for digital use through Unicode adoption, web layout standards, and InScript keyboard layouts. Toolkits for 22 Indian languages have enabled word processing, email, presentations, and web browsing in local scripts.
Research in machine translation, optical character recognition, and speech technologies has further demonstrated the potential for scalable AI-based solutions.
The National Language Translation Mission (NLTM) builds on these foundations to remove language barriers using technology, making governance and policy-related content available in major Indian languages.
Sanskrit Knowledge Accessor
The Sanskrit Knowledge Accessor project aims to bridge the gap between ancient Sanskrit texts and contemporary readers by developing advanced linguistic tools and resources. The project focuses on three key applications and several core objectives:
Key Applications
- Ayurveda Reading Aid for Vaidyas: Develop a tool that assists Ayurvedic practitioners in reading and understanding Sanskrit texts, enhancing access to traditional medical knowledge.
- Darshana or Classics Reading Aid: Create a reading aid to popularize classical Sanskrit literature such as the Upanishads and Mahābhārata, enabling easier comprehension and appreciation.
- Objective-type Question-Answering System on Bhāvaprakāśa Nighaṇṭu (BPN): Implement a system that allows users to ask objective-type questions about this classical Ayurvedic text and receive precise answers.
Core Objectives
- Development of Sanskrit-Hindi, Sanskrit-English, and Sanskrit-Tamil Accessors: Build tools that support translation and comprehension between these languages.
- Generation of High-Quality Annotated Data: Produce linguistically rich, annotated datasets for training language models and tools.
- Exploration of Traditional Knowledge in Mīmāṃsā and Other Literature: Leverage traditional theories of verbal cognition to better understand and structure discourse.
BHASHINI (Bhasha Innovation)
BHASHINI is a key initiative under the National Language Translation Mission (NLTM) that aims to break language barriers across India by leveraging cutting-edge natural language technologies. Its vision is to create a diverse and inclusive ecosystem that enables seamless communication across Indian languages.
Mission
To bridge the digital, literacy, and language divides by developing voice-first, multilingual solutions that make digital services accessible to all citizens in their native languages.
Key Features
- Multilingual Website Translation: Tools to translate websites into 22+ Indian languages.
- Crowdsourcing via BhashaDaan: A platform where citizens contribute to language resources.
- Mobile App: Enables users to interact in their preferred languages and contribute to the ecosystem.
Ecosystem Components
- Arpan: Resource sharing
- Sahyogi: Partner and collaborator portal
- Pravakta: Speech tech initiatives
- Gyan Kosh: Knowledge repository
- Parikshan: Testing platform
- Udbhav: Support for startups and innovations
- Talent Samudaye: Talent and developer community
BHASHINI operates as an independent division under the Digital India Corporation, ensuring that every citizen can access digital content and services in their own language.
Learn more or contribute through BhashaDaan.