
The John Joseph Moakley United States Courthouse in Boston. Photo Credit: 4300streetcar
Exactly how many tracks were used to train Suno’s generative models? The world may never know – at least if the company gets its way. Amid an ugly legal battle with the non-Warner majors, the AI music platform is fighting to conceal the figure.
As some will recognize, this is the second tooth-and-nail sub-dispute that the high-stakes lawsuit has delivered of late. Now staring down a supersized copyright infringement and DMCA complaint, Suno is also far from eager to divulge the terms of its Warner Music licensing pact.
But as we reported, Universal Music and Sony Music themselves are attempting to secure a copy of the contract. On the other hand, those same plaintiffs do not oppose Suno’s request to keep the training tally from becoming public information.
Rather, New York-based Inner City Press (ICP) is pushing to obtain the number – or, more specifically, to stop Suno from impounding the legal docs containing the number.
“These are not peripheral discovery documents — they are the core pleadings and supporting submissions for Plaintiffs’ motion to amend the complaint,” ICP’s Matthew Lee wrote in a letter urging the court to deny Suno’s request.
From there, Lee argued that the “information is of direct public concern,” particularly for the professionals who “have a direct interest in understanding what is alleged about how their recordings were used.”
In the opposite corner, Suno in a follow-up filing framed its request as pertaining to “a single figure” – referring to “the total volume of audio files” ingested for training.
“[T]he sole information Suno seeks to impound is the total number of audio files allegedly used to train its model,” the defendant wrote. “Suno has never disclosed that figure publicly, and for good reason.
“If competitors learned the volume of Suno’s training data—a core component of the tool has [sic] Suno built—they could use that information to replicate and benchmark against Suno’s models, infer aspects of Suno’s training and development process, and potentially optimize their models to unfairly compete with Suno’s by leveraging Suno’s confidential business information,” the platform continued.
Similarly, Suno co-founder and CTO Georg Kucsko in an almost-verbatim declaration expressed the belief that rival AI developers could capitalize on the training number “to benchmark their own systems against Suno’s model, infer aspects of Suno’s training and development approach, and” more.
Now, all eyes are on the judge’s determination, which hadn’t made its way into the docket at the time of writing.
Elsewhere in the AI litigation arena, Sony Music is charging ahead with an expanded action against Udio despite the startup’s Warner Music and Universal Music licensing agreements.
Unsurprisingly, a firm training-volume total hasn’t been publicly disclosed in that suit, either. Sony Music’s review “of Udio’s training data revealed that Udio collected over [redacted] audio files to build its training dataset,” a relevant line reads.
