How will the Internet evolve in the coming decades?
Fiction writers have explored some of the possibilities.
In his 2019 novel “Fall,” science fiction writer Neal Stephenson imagined a near future in which the Internet still exists. But it has become so polluted with misinformation, disinformation and advertising that it is essentially useless.
The characters in Stephenson's novel deal with this problem by subscribing to “edit streams” – human-selected news and information that can be considered trustworthy.
The downside is that only the wealthy can afford such tailored services, leaving most of humanity to consume low-quality, uncurated online content.
To some extent, this has already happened: many news organizations, such as The New York Times and The Wall Street Journal, have placed their curated content behind paywalls. Meanwhile, misinformation festers on social media platforms like X and TikTok.
Stephenson's track record as a prognosticator has been impressive – he anticipated the metaverse in his 1992 novel “Snow Crash,” and a central plot element of his “The Diamond Age,” published in 1995, is an interactive primer that works much like a chatbot.
On the surface, chatbots look like an answer to the misinformation epidemic. By dispensing factual content, chatbots could provide alternative sources of high-quality information that are not walled off by paywalls.
Ironically, however, the output of these chatbots may pose the greatest danger to the future of the web – a danger hinted at decades earlier by the Argentine author Jorge Luis Borges.
The rise of chatbots
Today, a significant portion of the Internet still consists of factual and ostensibly truthful content, such as articles and books that have been peer-reviewed, fact-checked or vetted in some way.
The developers of large language models, or LLMs (the engines that power bots like ChatGPT, Copilot and Gemini), have taken advantage of this resource.
To work their magic, however, these models have to ingest immense amounts of high-quality text for training purposes. A vast amount of verbiage has already been scraped from online sources and fed to the fledgling LLMs.
The problem is that the Internet, vast as it is, is a finite resource. High-quality text that has not already been strip-mined is becoming scarce, leading to what The New York Times described as a looming crisis in content.
This has forced companies like OpenAI to strike agreements with publishers to obtain even more raw material for their voracious bots. According to one forecast, however, a shortage of additional high-quality training data could arrive as early as 2026.
As chatbot output ends up online, these second-generation texts – complete with made-up information called “hallucinations,” as well as outright errors, such as suggestions to put glue on your pizza – will further pollute the web.
And if a chatbot hangs out with the wrong sort of people online, it can pick up their repugnant views. Microsoft discovered this the hard way in 2016, when it had to pull the plug on Tay, a bot that began repeating racist and sexist content.
Over time, all of these issues could make online content even less trustworthy and less useful than it is today. In addition, LLMs fed a low-calorie diet may produce even more problematic output that also ends up on the web.
An infinite – and useless – library
It is not hard to imagine a feedback loop that results in a continuous process of degradation as the bots feed on their own imperfect output.
A July 2024 paper published in Nature examined the consequences of training AI models on recursively generated data. It found that “irreversible defects” can lead to “model collapse” for systems trained this way – much as a copy of an image, and a copy of that copy, and a copy of that copy, will lose fidelity to the original image.
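The copy-of-a-copy effect can be illustrated with a toy simulation. The sketch below is not the method used in the Nature paper; it is a deliberately simplified stand-in in which the "model" is nothing more than the frequency table of its training corpus, and "generating" means sampling from that corpus with replacement. The function name `resample_generations` and the corpus of 200 numbered "facts" are assumptions made purely for illustration.

```python
import random


def resample_generations(corpus, generations, seed=0):
    """Repeatedly 'train' on a corpus and 'generate' a new one of the same size.

    The toy 'model' is just the empirical distribution of the corpus, so
    generating text means drawing items with replacement from the previous
    generation. Returns the number of distinct items in each generation.
    """
    rng = random.Random(seed)
    distinct_counts = [len(set(corpus))]
    for _ in range(generations):
        # Each new corpus is built only from the previous generation's output.
        corpus = rng.choices(corpus, k=len(corpus))
        distinct_counts.append(len(set(corpus)))
    return distinct_counts


# Hypothetical starting corpus: 200 distinct human-written "facts".
original = list(range(200))
counts = resample_generations(original, generations=30)
print("distinct items at start:", counts[0])
print("distinct items after 30 generations:", counts[-1])
```

Because each generation can only reuse what the previous one produced, an item that drops out never returns, so the count of distinct items can only stay level or shrink. Real language models are far more complex, but the sketch shows why feeding a system its own output tends to erode rare material first.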
How bad could this get?
Consider Borges' 1941 short story “The Library of Babel.” Fifty years before the computer scientist Tim Berners-Lee created the architecture for the web, Borges had already imagined an analog equivalent.
In his 3,000-word story, the writer imagines a world made up of an enormous and possibly infinite number of hexagonal rooms. The bookshelves in each room hold uniform volumes that, as the inhabitants surmise, must contain every possible combination of the letters of their alphabet.
At first, this realization sparks joy: by definition, there must be books that describe the future of humanity and the meaning of life in detail.
The inhabitants search for such books, only to find that the vast majority contain nothing but meaningless combinations of letters. The truth is out there – but so is every conceivable falsehood. And all of it is embedded in an unimaginably vast amount of gibberish.
Even after centuries of searching, only a few meaningful fragments are found. And even then, there is no way to determine whether these coherent texts are truths or lies. Hope turns into despair.
Will the Internet become so polluted that only the wealthy can afford accurate, reliable information? Or will an infinite number of chatbots produce so much tainted verbiage that finding accurate information online becomes like looking for a needle in a haystack?
The Internet is often described as one of humanity's great achievements. But as with any other resource, it is important to think seriously about how it is maintained and managed – lest we end up confronting Borges' dystopian vision.