Reddit to update web standard to block automated website scraping

  • 📰 globeandmail


The move comes at a time when artificial intelligence firms stand accused of plagiarizing publishers' content to create AI-generated summaries without giving credit or asking for permission.

Social media platform Reddit said on Tuesday that it will update the web standard it uses to block automated data scraping from its website, following reports that AI startups were bypassing the rule to gather content for their systems.

Reddit said it would update the Robots Exclusion Protocol, or “robots.txt,” a widely accepted standard that tells crawlers which parts of a site they are allowed to visit. More recently, robots.txt has become a key tool that publishers employ to prevent tech companies from using their content free of charge to train AI algorithms and create summaries in response to some search queries.
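For context, a robots.txt file is a plain-text file served at a site's root that names crawlers (by their User-agent string) and the paths they may or may not fetch. The sketch below is purely illustrative and is not Reddit's actual file; the bot name and paths are generic examples:

```text
# Illustrative robots.txt (not Reddit's real configuration)

# Block a hypothetical AI crawler from the entire site
User-agent: ExampleAIBot
Disallow: /

# All other crawlers: allow everything except one path
User-agent: *
Disallow: /private/
Allow: /
```

Note that robots.txt is advisory rather than enforceable: a crawler must choose to honor it, which is why reports of AI startups ignoring the rule prompted publishers to seek stronger measures.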

This follows a Wired investigation that found AI search startup Perplexity likely bypassed attempts to block its web crawler via robots.txt.

 
