News sites are blocking the Internet Archive to stop AI crawling. Is the “open web” closing?

February 5, 2026


When the World Wide Web went live in the early 1990s, its founders hoped it would be a place where everyone could share information and collaborate. But today the free and open web is shrinking.

The Internet Archive has chronicled the history of the Internet and made it available to the public through its Wayback Machine since 1996. Now some of the largest news outlets in the world are blocking the archive's access to their pages.

Major publishers – including The Guardian, The New York Times, The Financial Times and USA Today – have confirmed they’re ending Internet Archive access to their content.

While publishers say they support the archive's preservation mission, they argue that unrestricted access has unintended consequences, exposing journalism to AI crawlers and to members of the public trying to bypass paywalls.

Yet publishers do not simply want to lock out AI crawlers. Rather, they want to sell their content to data-hungry technology companies. Their back catalogs of news, books and other media have become a hot commodity as data for training AI systems.

Robot reader

Generative AI systems like ChatGPT, Copilot and Gemini require access to large archives of content (such as media content, books, art and academic research) for training and to respond to user prompts.

Publishers claim tech companies have accessed much of this content for free and without the permission of copyright holders. Some have begun taking tech companies to court, claiming that they have stolen their intellectual property. Well-known examples include The New York Times' lawsuit against ChatGPT's parent company OpenAI and News Corp's lawsuit against Perplexity AI.

The New York Times has sued OpenAI for alleged copyright infringement.
Sarah Yenesel/EPA

Old news, new money

In response, some tech companies have struck deals to pay for access to publishers' content. News Corp's deal with OpenAI is reportedly worth more than $250 million over five years.

Similar agreements have been reached between academic publishers and technology companies. Publishers such as Taylor & Francis and Elsevier have been criticized in the past for locking publicly funded research behind commercial paywalls.

Now, Taylor & Francis has signed a $10 million non-exclusive deal with Microsoft, giving the company access to over 3,000 journals.

Publishers are also using technology to stop unwanted AI bots from accessing their content, including the crawlers used by the Internet Archive to record Internet history. News publishers have referred to the Internet Archive as a “back door” into their catalogs, one that allows unscrupulous tech companies to continue scraping their content.
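One common way a site can signal which bots are unwelcome is a robots.txt file; the article does not say which mechanism any particular publisher uses, so the following is only a minimal sketch under that assumption. The robots.txt content, the `news.example` address and the `blocked_agents` helper are hypothetical, and the user-agent names are illustrative examples of an AI crawler and an archive crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary news site (not taken from any
# real publisher). It turns away one AI crawler and one archive crawler
# while leaving every other bot free to fetch pages.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: archive.org_bot
Disallow: /

User-agent: *
Allow: /
"""


def blocked_agents(robots_txt, agents, url="https://news.example/story"):
    """Return the agents that the given robots.txt rules bar from `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


print(blocked_agents(ROBOTS_TXT, ["GPTBot", "archive.org_bot", "Googlebot"]))
```

Rules like these are only a request: a crawler that ignores robots.txt is not technically stopped, which is one reason sites may pair such rules with server-side blocking.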

A person browses the Internet Archive on a laptop. The Internet Archive has been systematically archiving the web for about three decades.
Serene Lee/SOPA Images/LightRocket via Getty Images

The cost of making news free

The Wayback Machine has also been used by members of the public to bypass newspaper paywalls. Understandably, media outlets want readers to pay for news.

News is a business, and its advertising revenue model has come under increasing pressure from the same tech companies that use news content for AI training and retrieval. However, protecting that business comes at the expense of public access to credible information.

When newspapers began putting their content online and making it freely available to the public in the late 1990s, they contributed to the ethos of sharing and collaboration of the early web.

In hindsight, however, one commentator described free access as the “original sin” of online news. The public became accustomed to receiving digital news for free, and as online business models changed, many small and mid-sized news organizations struggled to fund their operations.

The opposite approach, putting all commercial news behind paywalls, has its own problems. As news publishers move to subscription-only models, people must juggle multiple expensive subscriptions or limit their appetite for news. Otherwise, they are left with whatever news remains free online or is served up by social media algorithms. The result is a more closed, more commercial Internet.

This is not the first time the Internet Archive has been in the crosshairs of publishers: the organization was previously sued and found to have infringed copyright through its Open Library project.

The past and future of the Internet

The Wayback Machine has served as the Internet's public record for about three decades, used by researchers, educators, journalists and amateur Internet historians.

Blocking its access to major international newspapers will leave significant gaps in the Internet's public record.

Today you can use the Wayback Machine to see the June 1997 front page of The New York Times: the first time the Internet Archive crawled the newspaper's website. In another 30 years, Internet researchers and curious members of the public will not have access to today's front page, even if the Internet Archive still exists.

Today's websites become tomorrow's historical records. Without the preservation efforts of nonprofit organizations like the Internet Archive, we risk losing vital records.

Despite the actions of commercial publishers and the emerging challenges of AI, nonprofit organizations like the Internet Archive and Wikipedia aim to keep the dream of an open, collaborative and transparent Internet alive.
