Cloud

Neysa, Pipeshift Launch India-Based AI Inference Cloud

The partnership will address growing concerns among Indian enterprises over dependence on overseas AI infrastructure, rising API costs and latency issues tied to foreign-hosted inference services.

(Image-Freepik)

Neysa and Pipeshift have partnered to launch a production-grade AI inference platform hosted entirely within India, targeting enterprises that are increasingly deploying AI-powered voice agents, copilots and automation systems at scale.

The companies said the partnership will address growing concerns among Indian enterprises over dependence on overseas AI infrastructure, rising API costs and latency issues tied to foreign-hosted inference services.

India is rapidly emerging as one of the world’s largest inference-heavy AI markets, driven by expanding adoption of enterprise AI applications across customer support, software development, analytics and workflow automation.

The new platform combines Neysa’s AI Acceleration Cloud system, Velocis, with Pipeshift’s managed inference technology to offer dedicated, single-tenant inference environments for open-source AI models such as Llama, Mistral, DeepSeek, Gemma and Qwen through OpenAI-compatible APIs.

According to the companies, the infrastructure is optimised for latency-sensitive workloads including voice AI, enterprise search and reasoning systems, while ensuring prompts, inference traffic and enterprise data remain within India. The platform also supports speech-to-text, OCR and text-to-speech workloads.

“Scaling open-source models introduces a dual bottleneck: volatile token economics and high Time-to-First-Token (TTFT) driven by shared rate limits and cross-region routing. The upshot for enterprises is a seamless, OpenAI-compatible drop-in replacement that guarantees cold-starts, predictable and highly optimised token latency, and absolute sovereign data control at scale,” said Karan Kirpalani, Neysa Chief Product Officer.

“There is a clear line between AI that works in a demo and AI that works in production. Crossing that line takes more than a good model. It takes infrastructure that holds latency under load and keeps costs predictable at scale. That is the line our partnership with Neysa helps Indian companies cross,” said Arko Chattopadhyay, Pipeshift Co-Founder and CEO.

The companies said deployment timelines are typically under two weeks, with early enterprise customers already reporting significant reductions in inference latency for production AI workloads.

Neysa is a purpose-built AI Compute and Acceleration Cloud provider. Founded in 2023 by industry veterans Sharad Sanghi and Anindya Das, the company currently operates about 1,200 GPUs and plans a rapid scale-up, targeting more than 20,000 deployed units as demand for local AI compute accelerates.

Pipeshift is the dedicated inference platform for real-time AI workloads like voice, coding, and RAG agents.

Microsoft Shifts More AI Workloads To In-House MAI Models To Cut Costs

Meta Launches Muse Image AI Model To Power Image Creation Across Its Apps

Naukri Launches AI-Powered Recruitment Platform To Streamline Hiring

Meta Set To Launch Upgraded Muse Spark AI Model With Stronger Coding Capabilities

Read more

Microsoft Shifts More AI Workloads To In-House MAI Models To Cut Costs

Meta Launches Muse Image AI Model To Power Image Creation Across Its Apps

Naukri Launches AI-Powered Recruitment Platform To Streamline Hiring

Meta Set To Launch Upgraded Muse Spark AI Model With Stronger Coding Capabilities