The turning point appears to be Sarvam Vision, the company’s advanced multimodal AI model launched on February 5.
Menlo Ventures partner Deedy has publicly walked back his earlier criticism of India-based AI startup Sarvam AI, calling the company’s recent progress “really valuable” and praising its technological execution. In a post on X, Deedy said he was initially unconvinced by Sarvam’s focus on training smaller Indic-language models, arguing a year ago that the approach was misguided.
“That view was wrong,” he admitted. According to Deedy, Sarvam has since emerged with what he describes as the best text-to-speech, speech-to-text, and OCR models for Indic languages. He highlighted the startup’s reasonable pricing, ease of use, and polished product design, noting that Sarvam is addressing a gap that larger global AI labs are unlikely to prioritise in the near term.
While clarifying that he has no insight into Sarvam’s business metrics, Deedy said the company’s technical achievements stand out, adding that he “can’t remember the last time” a software product from India left such a strong impression.
Stark contrast with last year’s critique
Deedy’s praise marks a sharp contrast with his comments from May 24, 2025, when he was openly critical of Sarvam’s early flagship language model. At the time, he described the launch of Sarvam’s 24-billion-parameter LLM — post-trained on Indic data — as underwhelming, pointing out that it had logged just 23 downloads two days after release.
He compared this with a Korean open-source model trained by two college students that reportedly clocked nearly 200,000 downloads in a single month, calling the disparity “embarrassing.” That criticism reflected broader skepticism around India’s ability to produce globally competitive foundation models — sentiment that Sarvam’s latest releases appear to be challenging.
Sarvam Vision sets new benchmarks
The turning point appears to be Sarvam Vision, the company’s advanced multimodal AI model launched on February 5. Built around a 3-billion-parameter in-house vision-language architecture, Sarvam Vision focuses on document intelligence, combining OCR, layout understanding, and visual reasoning across India’s diverse scripts and languages.
In benchmark tests, Sarvam Vision topped olmOCR-Bench with 84.3% accuracy on English documents and delivered an average word accuracy of 87.36% across 22 Indian languages, outperforming models such as Google’s Gemini 3 Pro and other leading global systems on Indic OCR tasks. The model is capable of parsing complex PDFs, scientific papers, historical scans, tables, formulas, charts, and mixed-layout documents — areas where conventional OCR systems often struggle.
Trained using advanced techniques to improve reliability and semantic understanding across text and visuals, Sarvam Vision has shown strong performance in languages including Hindi, Bengali, Tamil, Telugu, Marathi, Malayalam, Kannada, Gujarati, Punjabi, Urdu, and Assamese.
Founded in 2023 by veterans of India’s Aadhaar digital identity project, Sarvam AI is offering free access to its Document Intelligence APIs and Vision experience until February 2026, aiming to accelerate the development of Indic-first AI applications.