It was cute. But it was still a lie. Gemini invented a news agency that doesn't exist and gave it a name, in both English and French.
Google's generative AI system had its fictional news agency report that a school bus drivers' strike had been declared in Quebec on September 12, 2025. But that wasn't why school transportation was disrupted that day. The real cause was the withdrawal of Lion Electric buses due to a technical problem.
This journalistic hallucination is probably the worst example of fabrication I saw in an experiment that lasted about a month. But I found many others.
Turning to AI chatbots for news
As a journalism professor with a focus on computer science, I had been using AI long before the launch of ChatGPT in 2022. According to the latest Digital News Report from the Reuters Institute for the Study of Journalism, in 2024, six percent of Canadians counted generative AI chatbots among their news sources.
I was curious to see how well these tools could tell me what was happening in my part of the world. Would they share hard facts with me, or made-up “news”?
Every morning last September, I asked seven generative AI systems the same open-ended question (in French):
“Tell me the five most important news events in Quebec today. Arrange them in order of importance. Summarize each in three sentences. Add a short title. For each event, cite at least one source (the specific URL of the article, not the homepage of the media outlet used). You can search the Internet.”
I worked with three paid tools (ChatGPT with its GPT-5 Auto model, Claude with its Sonnet 4.5 model and Gemini with its 2.5 Pro model), one tool provided by my employer (Copilot, with GPT-4 architecture) and three tools in their free versions (DeepSeek, Grok and Aria, a tool embedded in the Opera web browser).
Dubious, sometimes invented sources
Over the course of the month, I recorded 839 responses and first sorted them according to the sources provided. Since I was asking about news, I expected the AI tools to rely on news media.
However, in 18 percent of cases, they did not, relying instead on government websites, lobby groups or invented sources like the fictional news agency mentioned above.
Although most news media block generative AI crawlers, most of the responses I received cited news outlets. But often, the URL provided led to a 404 error (the URL was incorrect or made up), or to the media outlet's homepage or one of its sections (I classified these cases as “incomplete URLs”). This made it difficult to verify whether the news provided by the AI tools was reliable.
Only 37 percent of responses provided a complete, valid URL.
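To give a sense of how this kind of link-checking can be done, here is a minimal Python sketch; it is not the method used in this study, and the classify_url helper and its homepage heuristic are illustrative assumptions only.

```python
from urllib.parse import urlparse

import requests


def classify_url(url: str) -> str:
    """Rough sorting of a cited link into the three categories described above."""
    try:
        # Follow redirects so we judge the page the link actually lands on.
        response = requests.get(url, timeout=10, allow_redirects=True)
    except requests.RequestException:
        return "broken or fabricated"  # unreachable: the domain may not even exist
    if response.status_code >= 400:
        return "broken or fabricated"  # e.g. a 404: the article does not exist
    path = urlparse(response.url).path.strip("/")
    # Crude heuristic: a homepage or a top-level section (e.g. /sports) has
    # little or no path, whereas a specific article usually has a longer slug.
    if not path or "/" not in path:
        return "incomplete URL"
    return "complete URL"
```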
The summaries created by the AI systems were accurate in 47 percent of cases, though in four cases they were pure plagiarism. Just over 45 percent of the answers were only partially accurate.
I'll come back to this later. First, it is important to discuss the answers that were wholly or partially incorrect.
Content errors
The worst mistake I found was undoubtedly made by Grok. The generative AI tool offered on X, Elon Musk's social network, told me that asylum seekers were being mistreated in Chibougamau, in northern Quebec:
“About 20 asylum seekers were sent from Montreal to Chibougamau, but most quickly returned due to inadequate conditions. They report that they were mockingly treated like 'princes and princesses,' but in reality faced a lack of support. The incident raises questions about Quebec's refugee management.”
Grok backed up its claims with an article published that day. But it twisted the story. In fact, the article reported that the trip was a success: of the 22 asylum seekers, 19 received job offers in Chibougamau.
Other examples of inaccuracies:
- When a child was found alive in June 2025 after a gruelling four-day search, Grok falsely claimed that the child's mother had abandoned her daughter on a highway in eastern Ontario “to go on vacation.” This was not reported anywhere.
- Aria told me that French cyclist Julian Alaphilippe had won the Grand Prix Cycliste de Montréal, an annual road cycling race. That was untrue; Alaphilippe had won a similar race in Quebec City two days earlier. In Montreal, American Brandon McNulty won.
- Grok also claimed that “the (provincial) Liberals maintain a stable lead in a Léger poll.” In fact, the Quebec Liberal Party was in second place at the time; the Parti Québécois was in the lead.
I also noticed many spelling and grammatical errors in the French-language responses. There might have been fewer had I asked the tools to answer my questions in English.
I mentioned earlier that about 45 percent of the answers I was able to review were partially accurate. In these answers, I found a number of misinterpretations that, although incorrect, I could not classify as outright unreliable.
For example, the Chinese AI tool DeepSeek told me that the “apple season in Quebec” was “excellent.” The article on which it based this claim painted a more nuanced picture: “The season is not over yet,” said an orchard owner quoted in the article.
ChatGPT repeated the same odd phrase two days in a row, writing that Mark Carney is “the most popular federal prime minister in Quebec.” Of course, he is the only one.
Generative conclusions
In most cases, I rated news as “partially accurate” because of the conclusions drawn by the generative AI tools themselves.
For example, both Grok and ChatGPT reported a story about $2.3 million in emergency work on the Pierre Laporte Bridge in Quebec City. Grok's final sentence was: “This highlights the challenges of maintaining critical infrastructure in Quebec.” ChatGPT, for its part, wrote that the story “highlights the conflict between budget constraints, planning and public safety.”
None of this is wrong; some might even find such contextualization helpful. Nevertheless, these conclusions were not supported by any source, and no one quoted in the cited articles said anything of the sort.
In another example, ChatGPT concluded that an accident north of Quebec City “reignited the debate about road safety in rural areas.” No such debate was reported in the article cited by the AI tool. To my knowledge, this debate does not exist.
I found similar conclusions in 111 stories generated by the AI systems I used. They often included phrases such as “this case highlights,” “reignites the debate,” “illustrates tensions,” or “raises questions.”
In no case did I find anyone who actually raised the tensions or debates reported by the AI tools. These “generative conclusions” appear to spark debates that do not exist and could pose a risk of misinformation.
Proceed with caution
A few days after I published the French version of this story, a report from 22 public media organizations was released with similar findings.
The study found that “nearly half of all AI responses had at least one significant problem, (that) a third of the responses had serious sourcing problems (and that) a fifth had serious accuracy problems, such as hallucinated and/or outdated information.”
When we ask for news, we should expect generative AI tools to stick to the facts. Since this is not the case, anyone using AI as a source of reliable information should proceed with caution.

