The New York Times and Chicago Tribune sue Perplexity over alleged copyright infringement

    The New York Times and Chicago Tribune have launched separate lawsuits against AI search startup Perplexity, accusing it of systematic copyright infringement by scraping millions of articles to train models and regurgitating content verbatim in responses. The Times says that, despite its cease-and-desist demands, Perplexity continued harvesting paywalled content in real time, feeding it into tools like the Claude chatbot integration and Comet browser. These high-profile actions escalate the brewing war between media giants and AI firms over unauthorized use of journalistic output.

    Times’ Dual-Stage Infringement Claims

    The Times alleges that Perplexity violates its copyrights at both the ingestion and generation phases. First, web crawlers systematically scrape articles, including those behind paywalls, for model training, bypassing robots.txt directives and terms of service. Second, user queries trigger near-exact reproductions of Times stories, complete with bylines and formatting, undercutting traffic and ad revenue. Hallucinations compound the damage by falsely attributing fabricated quotes or events to the newspaper, eroding brand trust.

    According to the complaint, Perplexity’s “answer engine” circumvents traditional search economics by synthesizing and displaying source material directly, depriving publishers of referral clicks. The suit demands injunctions, statutory damages of up to $150,000 per willfully infringed work, and destruction of ingested datasets. Given the article volume cited, that could amount to billions in liability.

    Chicago Tribune Echoes Publisher Grievances

    The Tribune mirrors these accusations, claiming Perplexity copied “millions” of stories, videos, images, and other works to power its generative products. According to the suit, outputs replicate Tribune content identically or substantially, enabling free access without compensation or attribution. Like the Times, the Tribune highlights real-time scraping and direct reproduction as core violations, and it seeks similar remedies to halt the exploitation and recover losses from eroded digital subscriptions.

    Wider AI Copyright Battlefield

    These suits join dozens targeting AI scraping, including the Times’ ongoing case against OpenAI and Microsoft for training GPT models on copyrighted corpora. Reddit sued Perplexity alongside others for forum data misuse, while authors and artists pursue class actions. Perplexity’s selective licensing, such as its deal with Getty Images, highlights inconsistent practices, fueling arguments that news content merits equivalent deals.

    Defenses invoke fair use, claiming transformative indexing akin to Google Search, but courts increasingly scrutinize commercial reproduction and market harm. OpenAI’s pacts with Axel Springer and News Corp, along with a reported $25 million annual deal between the Times and Amazon, signal licensing as the endgame, though startups like Perplexity resist amid cash constraints.

    Industry Stakes and Precedents

    The resolution will shape AI’s information diet. Unrestricted scraping risks “content deserts” as publishers block crawlers, degrading model quality; mandatory licensing inflates costs, pressuring margins. Perplexity’s defiance, which includes dismissing complaints as competitive pressure, tests judicial tolerance in the wake of landmark rulings like Authors Guild v. Google, the Google Books case.

    For publishers, victories would affirm the value of intellectual property in the AI era, potentially birthing a new revenue stream via data royalties. Consumers face a tradeoff: enriched AI responses versus preserved journalism funding. As discovery unfolds and reveals scraping volumes and internal practices, Perplexity confronts existential scrutiny: adapt through deals or risk being dismantled as the next AI copyright casualty.
