DeepSeek May Have Trained Its Latest AI Model on Outputs From Google’s Gemini

Last week, the Chinese lab DeepSeek unveiled an updated version of its R1 reasoning AI model, which demonstrates strong performance across various mathematics and coding benchmarks. Although the company did not disclose the specific data sources used for training the model, some AI researchers suspect that a portion of the data may have originated from Google’s Gemini AI family.

Sam Paech, a developer based in Melbourne known for creating evaluations of “emotional intelligence” in AI, presented what he believes to be evidence that DeepSeek’s latest model was trained using outputs from Gemini. He noted in an X post that the R1-0528 model tends to favor terminology and phrases resembling those preferred by Google’s Gemini 2.5 Pro.

Paech stated, “If you’re curious why the new DeepSeek R1 sounds somewhat different, I suspect they shifted from training on synthetic OpenAI outputs to synthetic Gemini outputs.”
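
Comparisons of this kind can be approximated by measuring how often two models favor the same words and phrases. Below is a minimal sketch of such a lexical-overlap check; it is not Paech’s actual methodology, and the sample outputs and the choice of bigram cosine similarity are illustrative assumptions.

```python
# A minimal sketch of a lexical-overlap comparison between two models'
# outputs. The corpora below are hypothetical stand-ins, not real samples.
from collections import Counter
from math import sqrt

def ngram_counts(texts, n=2):
    """Count word n-grams across a collection of model outputs."""
    counts = Counter()
    for text in texts:
        words = text.lower().split()
        counts.update(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return counts

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two n-gram frequency vectors."""
    shared = set(a) & set(b)
    dot = sum(a[g] * b[g] for g in shared)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical corpora of outputs sampled from each model.
r1_outputs = ["delve into the multifaceted landscape of modern computing"]
gemini_outputs = ["delve into the multifaceted landscape of distributed systems"]

score = cosine_similarity(ngram_counts(r1_outputs), ngram_counts(gemini_outputs))
print(f"bigram-distribution similarity: {score:.3f}")
```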

While this is not definitive proof, another developer, the pseudonymous creator of a “free speech evaluation” tool called SpeechMap, observed that the model’s traces, the intermediate “thoughts” it generates while working toward an answer, read like Gemini traces.

DeepSeek has faced allegations of training on data from rival AI models before. In December, it was discovered that DeepSeek’s V3 model frequently identified itself as ChatGPT, OpenAI’s chatbot, suggesting that V3 may have been trained on ChatGPT chat logs.

Earlier this year, OpenAI told the Financial Times it had found evidence that DeepSeek had engaged in distillation, a technique in which a new model is trained on the outputs of a larger, more capable one. According to Bloomberg, Microsoft, a close OpenAI collaborator and investor, detected large amounts of data being exfiltrated through OpenAI developer accounts in late 2024, accounts OpenAI believes are linked to DeepSeek.

While distillation is an established practice, OpenAI’s terms of service prohibit customers from using its model outputs to build competing AI systems.
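
For context, the textbook form of distillation trains a “student” model to match a “teacher” model’s softened output distribution. The sketch below illustrates only that standard objective; it is not DeepSeek’s or OpenAI’s pipeline, and API-based distillation typically fine-tunes on sampled text rather than raw logits, which outside parties cannot access.

```python
# A minimal sketch of the classic distillation objective: train a student
# to match a teacher's softened output distribution via KL divergence.
# Illustrative only; not any particular lab's training setup.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 to keep gradient magnitudes comparable across temperatures.
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * temperature**2

# Toy example: a batch of 4 examples over a 10-token vocabulary.
teacher_logits = torch.randn(4, 10)
student_logits = torch.randn(4, 10, requires_grad=True)
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
print(f"distillation loss: {loss.item():.4f}")
```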

It’s worth noting that many models misidentify themselves and converge on the same words and phrases, largely because of the glut of low-quality AI-generated content on the open web: content farms churn out AI-written clickbait, and bots flood platforms like Reddit and X.

This contamination has made it increasingly difficult to thoroughly filter AI-generated output from training datasets.
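
One common mitigation is heuristic filtering: dropping web documents that contain telltale AI-assistant phrases before training. Below is a minimal sketch with an illustrative phrase list; real pipelines combine much broader heuristics and learned classifiers.

```python
# A minimal sketch of one dataset-cleaning heuristic: drop documents
# containing known AI self-identification phrases. The phrase list and
# sample corpus are illustrative assumptions.
AI_TELL_PHRASES = (
    "as an ai language model",
    "i am chatgpt",
    "i don't have personal opinions",
)

def looks_ai_generated(document: str) -> bool:
    """Flag documents containing known AI self-identification phrases."""
    lowered = document.lower()
    return any(phrase in lowered for phrase in AI_TELL_PHRASES)

corpus = [
    "The 2024 eclipse was visible across North America.",
    "As an AI language model, I cannot browse the internet.",
]
clean = [doc for doc in corpus if not looks_ai_generated(doc)]
print(f"kept {len(clean)} of {len(corpus)} documents")
```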

However, experts like Nathan Lambert, a researcher at the nonprofit AI research institute AI2, believe it’s entirely plausible that DeepSeek trained on data sourced from Google’s Gemini. He commented, “If I were DeepSeek, I would certainly generate a significant amount of synthetic data from the best API model available. They are short on GPUs yet have ample funds, making it a smart computation choice for them.”
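
What Lambert describes, generating synthetic training data from a strong API model, might look something like the sketch below. The prompts, model choice, and output path are hypothetical, and the OpenAI Python client is used as a stand-in for whichever provider’s API a lab might query.

```python
# A minimal sketch of synthetic data generation via an API model.
# Hypothetical prompts and output path; requires OPENAI_API_KEY in the
# environment. Any provider's API could be substituted.
import json
from openai import OpenAI

client = OpenAI()
prompts = [
    "Prove that the sum of two even integers is even.",
    "Write a Python function that merges two sorted lists.",
]

with open("synthetic_data.jsonl", "w") as f:
    for prompt in prompts:
        response = client.chat.completions.create(
            model="gpt-4o",  # hypothetical choice of "best API model"
            messages=[{"role": "user", "content": prompt}],
        )
        pair = {"prompt": prompt, "completion": response.choices[0].message.content}
        f.write(json.dumps(pair) + "\n")
```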

In response to these issues, AI companies are enhancing their security measures to prevent distillation.

For instance, in April, OpenAI began requiring organizations to complete an ID verification process to access certain advanced models. The process requires a government-issued ID from one of the countries supported by OpenAI’s API; China is not on that list.

Similarly, Google recently began “summarizing” the raw traces generated by models available through its AI Studio developer platform, making it harder to train rival models on Gemini traces. In May, Anthropic said it would likewise summarize its own models’ traces, citing the need to protect its competitive advantage.

We have reached out to Google for comment and will update this article if we receive a response.