Shadow Library Anna’s Archive Says It Backed Up Spotify: 86 Million Tracks, ~300 TB, and Torrents for Distribution

Anna’s Archive, the shadow library best known for indexing pirated ebooks and academic papers, announced this weekend that it has scraped and archived a massive slice of Spotify. The group claims to have copied 86 million audio files (just under 300 terabytes) and metadata for roughly 99% of Spotify’s 256-million-track catalog, and plans to distribute the dataset via bulk torrents over the coming weeks.

What they say they captured

- Audio: 86 million tracks, mostly the ones people actually listen to. Popular tracks are preserved in the original OGG Vorbis at 160 kbps (no re-encoding); less-listened items are compressed to OGG Opus at ~75 kbps to save space.
- Metadata: a database containing 186 million unique ISRCs (International Standard Recording Codes), roughly 37× larger than MusicBrainz, the largest legal open music database (about 5 million ISRCs).
- Distribution: torrents seeded across thousands of nodes. Anna’s Archive is rolling out the audio files gradually and has fully released the metadata. The group has asked users to help seed and may add individual downloads if demand is high.

Spotify’s response and the legal backdrop

- Spotify told Billboard that a third party “scraped public metadata and used illicit tactics to circumvent DRM to access some of the platform’s audio files.” Spotify has not confirmed the full scale that Anna’s Archive claims.
- Spotify called the group “anti-copyright extremists” and accused it of prior piracy targeting other platforms.
- Anna’s Archive already faces legal pushback in Europe: Belgium issued blocking orders with fines up to €500,000 (July 2025); the UK secured High Court blocks in December 2024; Germany’s major ISPs blocked the site’s main domains in October 2025.
- Google reports it has removed 749 million Anna’s Archive URLs from search results, about 5% of all DMCA requests Google has processed since 2012.
- Context: The Internet Archive once settled a high-profile lawsuit over the Great 78 Project after record labels sought $621 million. Anna’s Archive’s haul dwarfs that project both in scale and in how current the music is.

Why they say they did it, and what the archive shows

- Preservation claim: Anna’s Archive frames the work as cultural preservation, arguing that mainstream archiving focuses on hits and high-quality (lossless) formats, which risks “obscure” music disappearing when platforms change policies or vanish. Decentralized torrent distribution, the group says, creates redundancy that can’t be shut down by a single company.
- Selectivity: The group prioritized tracks using Spotify’s own popularity metric. Over 70% of Spotify’s 256 million tracks carry a popularity score of exactly zero, meaning they are effectively unheard. Only about 210,000 tracks (≈0.1% of the catalog) have popularity ≥50, and they account for the vast majority of listening. Archiving the entire catalog would have required roughly an additional 700 TB for material representing only about 0.04% of listening activity, Anna’s Archive says.
- Snapshot findings: Anna’s Archive published analytics from the dump that highlight oddities and trends:
  - Track durations cluster at ~2:00, 3:00, and 4:00 minutes.
  - Album releases have exploded since 2015; over 10 million albums are dated 2023 alone, likely driven in part by AI-generated and automated uploads.
  - Genre and artist counts: Electronic/Dance leads by artist count (520,075), followed by Rock (370,179) and World/Traditional (202,529). Opera, choral, and chamber music register high artist density per sub-genre.
  - Audio features: loudness strongly correlates with energy; BPM clusters around 120; speechiness and instrumentalness are generally low (vocals dominate); C major and G major are the most common keys.
About 13.5% of tracks are tagged explicit.

The economic and downstream implications

- Artist revenue impact: Spotify pays roughly $0.003–$0.005 per stream. Per DistroKid and Ditto Music estimates, 1 million streams ≈ $4,370 in royalties. Free torrent distribution undercuts that revenue stream.
- AI and data markets: Commenters on Hacker News flagged a commercial angle. Anna’s Archive has previously offered “enterprise” access to its book archives for large fees, which suggests the music dataset could have value to AI companies that train on large audio collections. The metadata has already been released; the audio files are still rolling out.
- Resilience and enforcement: Even if courts force domain takedowns or block access in some jurisdictions, torrents and distributed seeds make the dataset difficult to fully eradicate. This is the classic decentralized file-sharing problem.

What it means for the crypto and web3 community

- A decentralization case study: For builders in crypto and web3, this incident underscores how resilient peer-to-peer distribution can be for large datasets, both for legitimate preservation projects and for IP-infringing activity.
- Policy and rights management: It raises questions about how decentralized architectures intersect with copyright enforcement, artist compensation, and provenance tools (e.g., on-chain registries for rights, tokenized licensing, or decentralized identity for creators).
- Data for models: A near-complete, labeled music dataset at this scale would be extremely valuable for ML/AI audio research and model training, and it poses clear legal and ethical risks if used without licenses.

Bottom line

Anna’s Archive claims to have backed up almost everything people actually listen to on Spotify: 86 million tracks, plus a metadata trove that by its own count holds roughly 37 times as many ISRCs as the largest legal open music database. The group frames the move as cultural preservation; rights holders and platforms call it large-scale piracy.
The files are already moving through torrents, and with that distribution pattern the data is hard to fully retract. Expect intense legal fights, heated debates about preservation versus piracy, and renewed scrutiny on how decentralized distribution intersects with copyright and creator compensation.
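
A quick sanity check on the numbers

Several figures reported in this article can be cross-checked with simple arithmetic. The Python sketch below is not taken from Anna’s Archive or Spotify; it only re-derives a few ratios from the numbers quoted above. It assumes decimal units (1 TB = 10^12 bytes) and treats “just under 300 TB” as exactly 300 TB, so the average file size is an upper estimate. All constant and function names are illustrative.

```python
# Back-of-envelope checks on figures quoted in the article.
# Assumptions: decimal units (1 TB = 10**12 bytes); "just under 300 TB"
# is rounded up to 300 TB, so the average size is an upper estimate.

TRACKS_ARCHIVED = 86_000_000        # audio files claimed
ARCHIVE_BYTES = 300 * 10**12        # ~300 TB
CATALOG_TRACKS = 256_000_000        # Spotify catalog size cited
POPULAR_TRACKS = 210_000            # tracks with popularity >= 50
ISRCS_ARCHIVE = 186_000_000         # unique ISRCs in the metadata dump
ISRCS_MUSICBRAINZ = 5_000_000       # approximate MusicBrainz ISRC count cited
ROYALTY_PER_MILLION_USD = 4_370.0   # estimated payout for 1 million streams


def average_file_mb(total_bytes: int, n_files: int) -> float:
    """Mean file size in megabytes (10**6 bytes)."""
    return total_bytes / n_files / 10**6


def ratio(a: float, b: float) -> float:
    """How many times larger a is than b."""
    return a / b


def percent(part: float, whole: float) -> float:
    """Share of `part` in `whole`, as a percentage."""
    return part / whole * 100


def per_stream_usd(royalty_per_million: float) -> float:
    """Implied payout for a single stream."""
    return royalty_per_million / 1_000_000


def main() -> None:
    print(f"Average file size: {average_file_mb(ARCHIVE_BYTES, TRACKS_ARCHIVED):.2f} MB")
    print(f"ISRC ratio vs MusicBrainz: {ratio(ISRCS_ARCHIVE, ISRCS_MUSICBRAINZ):.1f}x")
    print(f"Popular share of catalog: {percent(POPULAR_TRACKS, CATALOG_TRACKS):.3f}%")
    print(f"Implied payout per stream: ${per_stream_usd(ROYALTY_PER_MILLION_USD):.5f}")


if __name__ == "__main__":
    main()
```

Under these assumptions the average archived file comes to about 3.49 MB, the ISRC ratio to 37.2×, the popularity ≥50 share to about 0.082% of the catalog (which the article rounds to 0.1%), and the implied payout to about $0.0044 per stream, inside the quoted $0.003–$0.005 range.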