Google says its flagship artificial intelligence chatbot, Gemini, has been inundated by “commercially motivated” attackers who repeatedly prompt it, sometimes sending thousands of different queries, in an attempt to clone it; one campaign prompted Gemini more than 100,000 times.
In a report released Thursday, Google said chatbots are increasingly being subjected to “distillation attacks”: barrages of repeated questions aimed at getting them to reveal their inner workings. Google describes this activity as “model extraction,” in which would-be imitators probe a system for the patterns and logic that make it work. Attackers appear to be using that information to build or improve their own AI models.
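To make the mechanics concrete, here is a toy sketch of the general technique the report describes. It is not from Google's report and does not reflect any real attack on Gemini: the attacker treats a deployed model as a black box, queries it many times, and trains a "student" model on the input/output pairs until it mimics the original. The names `toy_teacher` and the linear model are hypothetical stand-ins.

```python
def toy_teacher(x):
    """Stand-in for a proprietary black-box model.

    The attacker never sees this function's body, only its outputs.
    The internal weights (3.0 and 1.0) are the 'secret' being extracted.
    """
    return 3.0 * x + 1.0

# Step 1: harvest input/output pairs through repeated querying,
# the "thousands of different queries" pattern described above.
queries = [i / 10 for i in range(100)]
dataset = [(x, toy_teacher(x)) for x in queries]

# Step 2: fit a student model to the harvested pairs
# (closed-form ordinary least squares for this one-variable toy case).
n = len(dataset)
sx = sum(x for x, _ in dataset)
sy = sum(y for _, y in dataset)
sxx = sum(x * x for x, _ in dataset)
sxy = sum(x * y for x, y in dataset)
slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
intercept = (sy - slope * sx) / n

# The student recovers the teacher's behavior without ever
# seeing its internals: slope ≈ 3.0, intercept ≈ 1.0.
print(round(slope, 3), round(intercept, 3))
```

Real LLM distillation replaces the linear fit with training a neural network on the larger model's text outputs, but the shape of the attack is the same: enough query/response pairs let an outsider approximate behavior that the original developer considers proprietary.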
The company believes most of the culprits are private companies or researchers looking to gain a competitive advantage. A spokesperson told NBC News that Google believes the attacks originated from around the world, but declined to provide further details about what is known about the suspects.
John Hultquist, principal analyst in Google’s Threat Intelligence Group, said the scope of the attacks against Gemini suggests they are, or soon will be, common against small businesses’ custom AI tools as well.
“We’re going to be the canary in the coal mine, and there will be more incidents,” Hultquist said. He declined to name any suspects.
The company said it considers distillation attacks to be theft of intellectual property.
Technology companies are spending billions of dollars racing to develop AI chatbots, or large language models, and consider the inner workings of their top models invaluable proprietary information.
Although leading LLMs have mechanisms to detect distillation attacks and block those behind them, they are inherently vulnerable to distillation because they are open to anyone on the internet.
OpenAI, which runs ChatGPT, last year accused Chinese rival DeepSeek of conducting distillation attacks to improve its models.
Google said many of the attacks were designed to unravel Gemini’s “reasoning,” the algorithms that help it decide how to process information.
Hultquist said that as more companies design their own custom LLMs trained on potentially sensitive data, they become vulnerable to similar attacks.
“Let’s say your LLM has been trained on 100 years of secret thinking about how to trade. In theory, you can extract some of that,” he said.
