Reddit Sues Anthropic Over Alleged Data Scraping to Train AI

Reddit sues Anthropic for allegedly scraping its content to train AI, claiming continued violations despite warnings and contrasting with licensed rivals.

By Michael Morgenstern

In a lawsuit filed in California state court, Reddit Inc. has accused artificial intelligence company Anthropic PBC of systematically and unlawfully scraping Reddit’s content for years to train its large language model, Claude. Reddit claims Anthropic continued this practice even after being asked to stop, in stark contrast to competitors like OpenAI and Google, which entered into licensing agreements to use Reddit’s data.

The complaint, filed in San Francisco County Superior Court, portrays a company that publicly touts its ethical commitment to privacy while secretly violating Reddit’s terms of service. Reddit alleges that Anthropic’s dual identity—“a public face that attempts to ingratiate itself into the consumer’s consciousness with claims of righteousness… and the private face that ignores any rules”—undermines its users’ privacy and trust.

Reddit maintains that Anthropic’s use of deleted posts and personal content to train its AI directly violates the platform’s user agreement, which prohibits commercial exploitation of Reddit’s services or content without authorization.

Years of Unauthorized Access

According to the suit, Anthropic began scraping Reddit data as early as December 2021 and continued even after Reddit’s demands to stop. The data was allegedly used to improve Claude, Anthropic’s chatbot, which was released to the public in March 2023.

Reddit cited public statements from Anthropic’s own executives, including CEO Dario Amodei, acknowledging that Reddit comments are valuable for fine-tuning AI systems. One key allegation is that “Reddit's audit logs show that Anthropic continued to deploy its automated bots to access Reddit content more than one hundred thousand times in the subsequent months” after the company claimed to have stopped using crawlers.

Reddit’s lawsuit also asserts that Claude itself, when asked, admits to being trained on Reddit content, which Reddit says directly contradicts Anthropic’s public statements that it had ceased scraping the site.

Legal Claims and Requested Relief

Reddit’s five-count complaint includes claims of:

  • Breach of contract
  • Trespass to chattels
  • Unjust enrichment
  • Tortious interference with contractual relations
  • Unfair competition

The platform is seeking a range of remedies, including compensatory, punitive, and consequential damages, as well as disgorgement of Anthropic’s profits and attorneys’ fees. Additionally, Reddit demands specific performance—likely an injunction—to prevent Anthropic from continuing to access or use Reddit’s content without a license.

In a public statement, Reddit reaffirmed its belief in a free and open internet, but drew a hard line on unauthorized commercial exploitation. “This isn’t a misunderstanding,” the company said. “It’s a sustained effort to extract value from Reddit while ignoring legal and ethical boundaries.”

A Divided Approach Among AI Developers

A major point of contrast in Reddit’s filing is that both OpenAI and Google, Anthropic’s chief competitors, reached content licensing agreements with Reddit to train their AI models. Anthropic’s decision not to follow suit, Reddit argues, exemplifies a willful disregard for user rights and industry norms.

Anthropic, which is reportedly valued at $61.5 billion and backed by Amazon, Google, and Lightspeed Venture Partners, has stated it disagrees with Reddit’s claims and intends to “defend itself vigorously.”

Related Legal Troubles

The lawsuit is one of several legal battles Anthropic faces over its data practices. Other high-profile cases involve claims from authors, journalists, and music publishers, such as Universal Music Publishing Group and Concord Music Group, who allege that Claude has infringed on their intellectual property by reproducing copyrighted content, including song lyrics.

In a separate incident, an attorney representing Anthropic admitted to using Claude to draft a declaration that included a fabricated citation. While the attorney called it “an honest citation mistake,” the court struck part of the declaration from the record.

The Law Firms Involved

Reddit is represented by Quinn Emanuel Urquhart & Sullivan LLP, with attorneys John B. Quinn, Morgan W. Tovey, Corey Worcester, and Stefan Berthelsen handling the case.

Counsel for Anthropic had not been listed at the time of filing.

The case is Reddit Inc. v. Anthropic PBC, case number CGC25625892, in the Superior Court of the State of California, County of San Francisco.

About the author

Michael Morgenstern

Michael is Senior Vice President of Marketing at The Expert Institute. Michael oversees every aspect of The Expert Institute’s marketing strategy including SEO, PPC, marketing automation, email marketing, content development, analytics, and branding.
