DeepSeek Allegedly Shifts from OpenAI to Gemini for Training New AI Models

The company, however, has not disclosed its training data sources.


Chinese AI firm DeepSeek recently released an updated version of its reasoning model R1, which demonstrated strong performance on math and coding benchmarks.

While the company has not disclosed its training data sources, some researchers suspect the model may have been trained using outputs from Google’s Gemini AI.

Melbourne-based developer Sam Paech claims that R1-0528 exhibits language patterns similar to those of Gemini 2.5 Pro, suggesting possible overlap in training data.

Another developer, known as the creator of “SpeechMap,” noted that the model’s internal reasoning patterns resemble those of Gemini.

In March, Google unveiled Gemini 2.5, its latest AI model designed to handle complex reasoning and coding tasks. This release includes the Gemini 2.5 Pro Experimental, which has secured the top spot on the LMArena leaderboard and excels in various coding, math, and science benchmarks.

Interestingly, DeepSeek has previously faced allegations of using data from rival AI models in its training processes. Last year, developers noticed that its V3 model frequently referred to itself as ChatGPT—OpenAI’s chatbot—raising suspicions that it may have been trained on ChatGPT conversation logs.

Earlier this year, OpenAI told the Financial Times it had uncovered evidence linking DeepSeek to "distillation," a method of training smaller models using outputs from more advanced ones.
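To illustrate the concept, the sketch below shows distillation in its simplest form: a small "student" model is fit to the outputs of a larger "teacher" model rather than to ground-truth labels. This is a generic, minimal illustration of the technique; the functions and parameters here are invented for the example and do not represent DeepSeek's or OpenAI's actual systems.

```python
# Minimal sketch of distillation: train a small student to imitate a
# teacher's outputs. All names here are illustrative assumptions.

def teacher(x):
    # Stand-in for a large, capable model: maps an input to a score.
    return 3.0 * x + 1.0

def train_student(inputs, lr=0.01, epochs=1000):
    # Student: a tiny linear model y = w*x + b, fit by gradient descent
    # on squared error against the teacher's outputs (not real labels).
    w, b = 0.0, 0.0
    targets = [teacher(x) for x in inputs]  # teacher outputs serve as labels
    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            err = (w * x + b) - t
            w -= lr * err * x
            b -= lr * err
    return w, b

# The student gradually converges toward the teacher's behavior.
w, b = train_student([0.0, 1.0, 2.0, 3.0])
```

In practice, distillation targets are usually full probability distributions over tokens rather than scalar scores, but the principle is the same: the smaller model learns from the larger model's outputs instead of from original training data.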