Economic Survey calls for data sovereignty as India trails in AI training startups

Without stronger domestic capabilities in data curation and model development, a large share of the economic value from Indian data risks being captured overseas, the Economic Survey said.   

Arun Padmanabhan
  • Jan 29, 2026
  • Updated Jan 29, 2026, 5:50 PM IST

India’s startup ecosystem has scaled rapidly, but the Economic Survey 2025–26 highlighted a critical gap in the country’s AI economy: limited participation in startups focused on training data and foundational infrastructure.

While India ranks among the top global contributors to AI research and has one of the world’s most AI-literate labour forces, only about 2% of startups engaged in curating training data are based in India, compared with 40% in the US and 21% in the European Union, the Survey said.

The Survey pointed to India’s vast and diverse domestic datasets across health, agriculture, finance and public administration as a potential comparative advantage that remains underexploited. Without stronger domestic capabilities in data curation and model development, a large share of the economic value from Indian data risks being captured overseas, the Economic Survey said.   

“Given that the current stock of training data is soon expected to run out and models collapse when trained on synthetic data, firms will be on the lookout for new sources of human-generated data that are not accessible through online scraping alone. Policy must remain cognizant of the potential value embedded in India’s data,” the Survey said.

Infrastructure constraints are compounding the challenge. India hosts only about 3% of global data centres by count, far behind high-income economies, limiting access to advanced compute for early-stage companies. At the same time, shortages of GPUs and high-bandwidth memory chips are delaying projects and raising costs.  

Against this backdrop, the Survey outlined a governance framework aimed at retaining strategic control over Indian data while allowing firms to operate global AI pipelines. It said entities that process Indian personal data at scale, particularly for high-impact applications such as general-purpose model training, would be expected to ensure such data remains auditable and retrievable, and subject to oversight by Indian regulators. 

Rather than mandating that all processing happen domestically, the focus is on technical and contractual arrangements that allow authorities to trace data provenance, examine downstream use and, where required, order corrective actions including deletion, retraining or suspension of models.

The Survey said this approach would build on the Digital Personal Data Protection (DPDP) Act, 2023, calling for more precise data categorisation so regulation can be targeted by sensitivity and economic use, instead of treating all data as homogeneous.

A key operational element is the requirement for eligible firms to maintain a mirrored copy of relevant datasets and derived artefacts within India. This, the Survey noted, would preserve regulatory oversight even when processing occurs offshore, while avoiding the rigidity and costs of compulsory in-country processing. 

“In this sense, sovereignty is exercised not through physical containment, but through enforceable rights and institutional capacity,” it said.

The Economic Survey framed AI as a strategic priority rather than a purely commercial opportunity. It warned that reliance on foreign AI platforms could constrain India’s future choices as export controls and technology restrictions increasingly shape global power dynamics.

“AI should not be regarded merely as a technological advancement, but as a strategic priority with far-reaching implications for the future of India’s critical infrastructure, labour market, foreign policy and culture,” the Survey said.

