Sarvam AI Features In India: India’s sovereign AI journey is no longer just about big global companies opening offices or data centres here. At a time when India still seemed to trail companies like OpenAI and Google, Bengaluru-based startup Sarvam AI has marked a crucial step in India’s technology story by creating home-grown AI models made specifically for Indian users.

Sarvam AI has introduced two new AI models, Bulbul V3 and Sarvam Vision. What makes this development notable is that Sarvam Vision has outperformed popular tools like ChatGPT and Google Gemini at optical character recognition (OCR), the task of reading and understanding text in images. The result shows that India is now developing strong, advanced AI solutions of its own.

Drop 6/14: @SarvamAI is proud to announce a landmark in India’s sovereign AI journey through strategic partnerships with the Governments of Odisha and Tamil Nadu. The aim of these partnerships is to drive transformation by building at-scale compute, sovereign models, and the… pic.twitter.com/Scx9mK6CPw — Pratyush Kumar (@pratykumar) February 9, 2026


For a long time, discussions around large language models (LLMs) in the tech world have been dominated by the US and China. Despite India’s vast talent pool and massive AI market, the absence of a locally developed AI model has often raised questions about the country’s position in the global AI race.

Sarvam AI beats ChatGPT, Gemini 3 Pro and DeepSeek

Sarvam AI has recently gained attention for delivering stronger results than several leading global AI models across key benchmarks. Its OCR solution, Sarvam Vision, secured the top position on the olmOCR-Bench with an accuracy of 84.3 percent, outperforming well-known tools such as ChatGPT, Gemini 3 Pro, and DeepSeek OCR v2.

The model also scored 93.28 percent on OmniDocBench v1.5, showing its ability to accurately handle complex page layouts, technical tables, and mathematical equations. These are areas that often challenge traditional OCR systems. In addition, Sarvam Vision has proven to be reliable for everyday tasks, including scanned documents, forms, and content in multiple languages.

Bulbul V3 Features

Bulbul V3, Sarvam AI’s text-to-speech model, supports 35 different voices. Sarvam Vision, the OCR model, covers 22 official Indian languages and content from as early as the 1800s to the present day, and it is designed to handle varying scan qualities and diverse types of content with accuracy. The series also includes a 3B-parameter state-space vision-language model that can perform advanced visual understanding tasks such as image captioning, scene text recognition, chart analysis, and the interpretation of complex tables.

Drop 5/14: Introducing Bulbul V3, our latest text-to-speech model. It raises the bar for how human it sounds, while being super robust.
In an independent third-party human listening study, Bulbul V3 delivers the highest listener preference, and low error rates across use-cases… pic.twitter.com/w7HThWzuKe — Pratyush Kumar (@pratykumar) February 7, 2026

Sarvam Vision and Bulbul V3: How they work

Sarvam Vision is designed as an India-first AI model that understands the country’s wide cultural and language diversity. It aims to build a strong AI foundation developed within India, making it a promising option for use in government projects, public infrastructure, and the banking, financial services and insurance (BFSI) sector.

We also evaluated for the long-tail of language challenges such as speaking numerics, technical content, and named entities. Bulbul V3 consistently has the lowest error rates across languages. pic.twitter.com/1COxQU80J7 — Pratyush Kumar (@pratykumar) February 7, 2026

Meanwhile, Bulbul V3 is Sarvam AI’s flagship text-to-speech model, built to handle India’s rich and complex language diversity. It marks a big step forward in creating natural, ready-to-use AI voices across multiple Indian languages.

One of its standout features is smooth language switching, which lets it move between languages such as Tamil and English or Hindi and English without any disruption. Currently, Bulbul V3 supports 11 Indian languages with over 35 voices, and Sarvam AI plans to add 22 more Indian languages in the future.