BBC threatens AI firm with legal action over unauthorised content use

52 ColinWright 18 6/20/2025, 12:36:44 PM bbc.co.uk ↗

Comments (18)

simonw · 6h ago

It looks to me like this is mainly about RAG - Perplexity answers user questions by running searches and then displaying content from those searches to users, and the BBC are arguing that this content display violates their copyright.

Unsurprisingly this article confuses the issue somewhat by also talking about training models on content. I understand why that's in there - it's a hot topic, especially in the UK right now - but I don't think it's directly relevant to this complaint.

The note about robots.txt is interesting - "The BBC said in its letter that while it disallowed two of Perplexity's crawlers, the company 'is clearly not respecting robots.txt'".

Perplexity describe their user-agents here: https://docs.perplexity.ai/guides/bots

I had a look at https://www.bbc.com/robots.txt and it does indeed block both PerplexityBot ("designed to surface and link websites in search results on Perplexity" - I think that's their search index crawler) and Perplexity-User ("When users ask Perplexity a question, it might visit a web page to help provide an accurate answer and include a link to the page in its response").

But... I checked the Internet Archive for a random earlier date - Feb 2025 - https://web.archive.org/web/20250208052005/https://www.bbc.c... - and back then the BBC were blocking PerplexityBot but not Perplexity-User.
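For anyone who wants to reproduce this kind of check, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt excerpt and the example.com URL are made up for illustration (this is not the BBC's actual file), and `is_allowed` is just a helper name chosen here. Only the two user-agent names come from the Perplexity docs quoted above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt excerpt for illustration only; NOT the BBC's real file.
SAMPLE_ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://www.example.com/news/some-article"  # placeholder URL
    for agent in ("PerplexityBot", "Perplexity-User", "SomeOtherBot"):
        print(agent, is_allowed(SAMPLE_ROBOTS_TXT, agent, url))
```

Note that this check runs on the client side: robots.txt only states the site's wishes, and it is up to each crawler to perform a check like this and honour the result.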

hadrien01 · 6h ago

They also write this:

> Since a user requested the fetch, this fetcher generally ignores robots.txt rules.

bitpush · 3h ago

> Since a user requested the fetch, this fetcher generally ignores robots.txt rules.

Normally the expectation is that the user-agent faithfully presents the content it fetched.

If I make a browser that fetches bbc.com, strips away the ads and presents the result to users - I would expect the BBC to not like it and block the user-agent from accessing it. It isn't a robots.txt thing. It is a user-agent thing.

simonw · 5h ago

Oh wow, I missed that! That's from the docs for that Perplexity-User user-agent, at which point presumably there's no point in listing that in robots.txt at all?

dabeeeenster · 5h ago

I mean, that's just not true.

esskay · 4h ago

Which part? It's widely established and known that many AI crawlers are ignoring the robots.txt file, Perplexity being one of them [1].

[1]: https://www.tomshardware.com/tech-industry/artificial-intell...

whilenot-dev · 5h ago

For what it's worth, this statement here regarding Perplexity-User:

> Since a user requested the fetch, this fetcher generally ignores robots.txt rules.

...was added sometime between 30.01.2025[0] and 07.02.2025[1], and makes it sound like robots.txt was not respected by that bot anyway.

[0]: https://web.archive.org/web/20250130164401/https://docs.perp...

[1]: https://web.archive.org/web/20250207113929/https://docs.perp...

simonw · 5h ago

Great catch there.

seydor · 5h ago

> In a statement, Perplexity said: "The BBC's claims are just one more part of the overwhelming evidence that the BBC will do anything to preserve Google's illegal monopoly."

Unless Perplexity has a way to indirectly pay writers the way Google does, this is very rich.

> four popular AI chatbots - including Perplexity AI - were inaccurately summarising news stories, including some BBC content.

One of the interesting things about the failures of LLMs is that news sources have become more concise and more authoritative. Even Google fails to get facts right with its AI summaries, so one is compelled even more to go read the website instead. And I'm not sure if LLMs will ever be able to tell truth from lies.

fcatalan · 5h ago

To be honest not visiting some websites is one of my main uses of Perplexity.

For example I like to watch F1 and I like to know the times for all sessions in my timezone during the weekend.

It's surprisingly hard to find this information, because the Google search is SEOed to hell and back by sites that hide the information behind endless articles full of irrelevant AI slop and 2 million intrusive ads, and that's if they have it right or at all.

Perplexity wades through all that shit, gives me a neatly formatted table and has never been wrong so far.

So I can see where the BBC is coming from but I also don't really want them to win.

bitpush · 3h ago

> To be honest not visiting some websites is one of my main uses of Perplexity.

I use it the same way as well, but every time I use it... I feel icky. A sense of impending doom.

Imagine a book summary service that helped users never buy any books. What is the incentive for a writer to write a book when they know that, within minutes, a summary of the work will be available on a different site?

News sites are unique in that the value they provide, for the most part, is the real-time nature of it. BBC reporting on the latest in London is the work of so many journalists, and if Perplexity sidesteps that, the BBC has no incentive (and, in the future, no money) to do that work. It kills the BBC, and it ultimately kills Perplexity.

So yes, Perplexity is playing a very dangerous short-term game, and the BBC is right to threaten legal action.

> BBC is coming from but I also don't really want them to win.

If the BBC doesn't win, the BBC (and other sites that "produce" information) dies, which kills Perplexity.

esskay · 6h ago

> In a statement, Perplexity said: "The BBC's claims are just one more part of the overwhelming evidence that the BBC will do anything to preserve Google's illegal monopoly."

That's got to be the most delusional response they could've given. It's not the BBC's or any other news publisher's job to preserve Google's monopoly. The comparison would only even work if Google were replacing a link to a BBC article in the search results with a direct copy of said article on the Google search results page.

oneeyedpigeon · 6h ago

I'd love to see some - any - of this "overwhelming evidence". I suspect it does not exist. I'd also love to ask Perplexity why they think the BBC would have any kind of bias toward Google; it just doesn't make any sense.

randall · 6h ago

this is the most non sequitur press statement ever.

josefritzishere · 6h ago

Good. I hope the BBC gets a historically large judgement and Google has to learn a valuable lesson.

bitpush · 3h ago

How does the BBC's lawsuit against Perplexity affect Google? Did you not read the article?

riskable · 5h ago

How is Perplexity different from running a Jupyter Notebook or anything, really, that lets you download a web page programmatically? I can spin up an AWS instance, log in, then run `python` and scrape the BBC's content as much as I want. Why aren't they suing Amazon (and every other company that lets you download stuff via their systems) for providing the same functionality?

A very old argument: If you don't want people scraping or downloading your content, don't put it on the (public) Internet!

Imagine we had LLM-like functionality in the 1980s: Sony announces a new VCR that can read a recorded news show and print out a summary on a connected Imagewriter II. People start using it to summarize the publicly-broadcast BBC news programs.

Today's scenario would be like the BBC suing Sony for providing that functionality.

ethbr1 · 4h ago

Because copyright is intrinsically linked to scale.

1000000x'ing fair use... might no longer be fair use.

The balances between society and copyright need to change when scale changes drastically.

To address the elephant in the room -- what happens when there are only leechers and no sources, because we've let them hijack first-party news revenue without creating a replacement?
