OpenAI Launches IndQA: A New Benchmark for Indian Languages and Culture
OpenAI built IndQA in partnership with 261 Indian experts—journalists, linguists, scholars, artists and industry practitioners.
OpenAI has unveiled IndQA, a new benchmark designed to evaluate how well AI models understand and reason about Indian languages and culture.
The move reflects OpenAI’s mission to “make AGI benefit all of humanity” and acknowledges that around 80% of the world’s population does not speak English as their primary language.
"India has about a billion people who don’t use English as their primary language, 22 official languages (including at least seven with over 50 million speakers), and is ChatGPT’s second largest market," OpenAI said in a blog post.
IndQA comprises 2,278 expert-authored questions spanning 12 languages (Bengali, Hindi, Kannada, Malayalam, Marathi, Odia, Telugu, Gujarati, Punjabi, Tamil, Hinglish and English) and 10 cultural domains, including Arts & Culture, Food & Cuisine, History, Law & Ethics, Literature & Linguistics, Media & Entertainment, Religion & Spirituality, Everyday Life and Sports.
OpenAI built IndQA in partnership with 261 Indian experts—journalists, linguists, scholars, artists and industry practitioners—to help probe reasoning-heavy, culturally nuanced tasks that existing multilingual benchmarks struggle to capture.
The company also used adversarial filtering, retaining only questions on which its strongest models (such as GPT-4o and GPT-4.5) initially failed, to ensure headroom for progress.
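For readers curious how such a filter might work in practice, the following is a minimal sketch, not OpenAI's actual pipeline. It assumes a question is kept only when every candidate model fails it; the function name `adversarial_filter`, the toy models and the `is_acceptable` grader are all hypothetical stand-ins for illustration.

```python
from typing import Callable, Dict, List


def adversarial_filter(
    questions: List[str],
    models: Dict[str, Callable[[str], str]],
    is_acceptable: Callable[[str, str], bool],
) -> List[str]:
    """Keep only the questions that every model fails (assumed rule)."""
    kept = []
    for question in questions:
        # A question survives only if no model gives an acceptable answer.
        any_pass = any(
            is_acceptable(question, answer_fn(question))
            for answer_fn in models.values()
        )
        if not any_pass:
            kept.append(question)
    return kept


# Toy stand-ins (hypothetical): each "model" maps a question to an answer.
toy_models = {
    "model_a": lambda q: "right" if q == "q1" else "wrong",
    "model_b": lambda q: "right" if q in ("q1", "q2") else "wrong",
}


def toy_grader(question: str, answer: str) -> bool:
    # Hypothetical grader: an answer is acceptable only if it is "right".
    return answer == "right"


hard_questions = adversarial_filter(["q1", "q2", "q3"], toy_models, toy_grader)
print(hard_questions)
```

In this toy run only the third question survives, since at least one model answers each of the other two. A real setup would replace the toy grader with expert-written grading criteria and could use a looser threshold than "every model fails".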
OpenAI notes that while its models have improved on Indian languages over time, “there is still substantial room for improvement.”
The firm hopes IndQA will serve as a “north star” for future benchmark creation in under-served languages and cultural domains.