Context:
The Indian government plans to incorporate texts, images, and narratives from religious scriptures in multiple languages, including Hindi, English, Tamil, Telugu, Kannada, and Urdu, among others, into the AIKosh database. Inputs will also include local dialects, oral storytelling traditions, and inspirational word-of-mouth narratives, all of which are seen as valuable for training AI systems and large language models (LLMs).
Purpose Behind Including Religious and Cultural Texts
- Religious texts are viewed as repositories of ancient wisdom and contextual knowledge, potentially enriching AI model training with deeply rooted ethical, philosophical, and linguistic insights.
- This approach aims to improve accuracy, cultural alignment, and contextual depth of Indian AI applications and LLMs.
Integration of Government Data
- The Ministry of Electronics and Information Technology (MeitY) has signed an MoU with the Lok Sabha Secretariat to use a rich dataset of:
  - Parliament questions and answers
  - Government reports
  - Committee meeting documents
  - Ministry-wise agendas
- These will be integrated into AIKosh to support transparent and accountable AI systems rooted in public governance data.
Current Status and Scale
- As of April 9, AIKosh hosts 350+ datasets and supports nearly 150 AI models, including both Large Language Models (LLMs) and Small Language Models (SLMs).
- The initiative is part of the ₹10,372 crore India AI Mission, specifically its India Datasets Platform, one of the mission’s seven core pillars.
Budget and Forward Planning
- In the 2025-26 Union Budget, ₹200 crore was allocated to the AIKosh initiative.
- The platform may also draw on non-personal, anonymized datasets from various ministries and the Open Government Data (OGD) Platform.
AIKosh Platform’s Data Use and Monetisation Policy Clarified
Key Points:
- The AIKosh platform will not allow monetisation of datasets, whether by the government or private sector, as confirmed by an official.
- The Minister of State for Electronics and Information Technology, Jitin Prasada, clarified in Parliament that the primary objective of AIKosh and the India Datasets Platform is to provide access to non-personal public and private sector data for developing AI applications, not to monetise that data.
Data Protection and Compliance
- The platform follows stringent data protection standards to ensure the security and confidentiality of user data.
- It adheres to Indian laws, including the Information Technology Act, 2000 and the Digital Personal Data Protection Act, 2023.
- The AIKosh platform does not involve data purchases or subscriptions in any form.