MBW’s Stat Of The Week is a series in which we highlight a single data point that deserves the attention of the global music industry. Stat Of The Week is supported by Cinq Music Group, a technology-driven record label, distribution, and rights management company.
The use of artificial intelligence-created music just moved up a gear.
We’re not talking about AI in mere instrumental music production, but the use of machine learning to actually mimic and even recreate human vocals – rendering a real singer unnecessary.
MBW first explored this topic last March, when we analyzed the long-term implications of HYBE’s investment into (and subsequent acquisition of) Korea-based artificial intelligence company Supertone – which claims that its AI tech can create “a hyper-realistic and expressive voice [not] distinguishable from real humans”.
Now, over in China, things have reached the next level: Tencent Music Entertainment (TME) says that it has created and released over 1,000 tracks containing vocals created by AI tech that mimics the human voice.
And get this: one of these tracks has already surpassed 100 million streams.
During the three months to end of September, TME rolled out what it refers to as “patented voice synthesis technology”, the Lingyin Engine. This tech, says TME, can “quickly and vividly replicate singers’ voices to produce original songs of any style and language”.
Part of TME’s initial work using the Lingyin Engine involved developing “synthetic voices in memory of legendary artists” such as the late Teresa Teng, and the late Anita Mui.
(‘Resurrecting’ the voice of a deceased star is something HYBE’s Supertone gained a lot of media attention for last year: The company used its own tech to recreate the voice of South Korean folk superstar Kim Kwang-seok.)
Cussion Pang, TME’s Executive Chairman, explained to analysts earlier today (November 15) that TME used the Lingyin Engine to “pay tribute” to Anita Mui by “creating an AI code based on her [voice]” for a new track – May You Be Treated Kindly By This World [English translation] – released in support of the New Sunshine Charity Foundation in China.
“[This track] has become the first song by an AI singer to be streamed over 100 million times across the internet.”
Cussion Pang, Tencent Music Entertainment
Teresa Teng’s voice was recreated by TME/the Lingyin Engine to lead the track Letter Not Sent [English translation], released earlier this year to mark the anniversary of the Taiwanese star’s death.
TME also confirmed today (November 15) that – in addition to “paying tribute” to the vocals of dead artists via the Lingyin Engine – it has created “an AI singer lineup with the voices of trending [i.e. currently active] stars such as Yang Chaoyue, among others”.
As mentioned, by the end of September, TME says it had created and released over 1,000 songs with human-style vocals manufactured by the Lingyin Engine.
One of those tracks has set the standard for popularity: TME’s Cussion Pang confirmed this morning to analysts that a version of one song, which appears to be called Today (English translation), “has become the first song by an AI singer to be streamed over 100 million times across the internet”.
Where could all of this go next?
For one thing, the mind inevitably wanders to the fact that over 100,000 tracks are now being uploaded to major global music streaming services every single day.
How high could that figure climb if limitless tracks are now being born with uncanny human-esque AI vocals?
It’s also worth remembering what Choi Hee-doo, the COO of Supertone – that’s the Korean AI voice-creation platform – said last year, when pondering how the tech might evolve.
“For example, BTS is really busy these days, and it’d be unfortunate if they can’t participate in content due to lack of time,” the exec told CNN.
“So, if BTS uses our technology when making games or audiobooks or dubbing an animation… they wouldn’t necessarily have to record [that audio live] in person.”
Interestingly, K-pop company HYBE’s biggest organic revenue driver in Q3 2022 was its Artist ‘Indirect-involvement’ business line, which sees the name and likeness of superstar artists such as BTS used in other areas like games and advertising without requiring the band’s active participation.
HYBE has now doubled down on its AI-generated voice plans by fully acquiring Supertone in October in a $32 million deal.
Indeed, when HYBE confirmed last month that BTS would be enlisting in the army, HYBE CEO Jiwon Park, breaking down HYBE’s strategy without its top-earning act, said that the company’s newly acquired AI voice startup will “serve as a key piece of the technology sphere we aim to create”.
He added: “HYBE plans to unveil new content and services to our fans by combining our content-creation capabilities with Supertone’s AI-based speaking and singing vocal synthesis technology.”
In addition to HYBE and TME, another giant of the technology and music world seems to be betting big on AI: TikTok and its parent company ByteDance.
Back in July 2019, ByteDance acquired Jukedeck, a UK-based AI music startup that specialized in creating royalty-free music for user-generated online videos.
In May, ByteDance launched Mawf, a machine-learning-driven music-making app that analyses incoming audio signals and then “re-renders” those signals using what it says are machine learning models of musical instruments. ByteDance also recently launched a music creation app in China called ‘Sponge Band’, according to Tech Planet.
This year, as first reported by MBW, the company has been doubling down on its AI-powered music-making ambitions via a hiring spree for AI music experts.
TikTok is specifically (and currently) hiring for a Research Scientist in Speech Synthesis in California. TikTok says that this person will “lead research to advance science and technology in Natural Language Processing and Speech Processing (e.g., Speech Synthesis, ASR)”.
They will also “research, model, design, develop and evaluate novel machine learning models and algorithms”.
It’s also hiring for a Research Scientist in Speech & Audio, in Seattle, Singapore and Mountain View.
TikTok says that this team’s focus is “on cutting-edge R&D in areas like speech & audio, music processing, natural language understanding and multimodal deep learning”.
Could TikTok – which runs its own artist distribution service SoundOn, and is reportedly preparing to expand its Resso music service into more markets – release tracks in the near future (just like Tencent Music) either fully created by AI, or featuring ‘AI synthetic voices’?
If it did, what would it mean for its relationship with the music industry?
Cinq Music Group’s repertoire has won Grammy awards, dozens of Gold and Platinum RIAA certifications, and numerous No.1 chart positions on a variety of Billboard charts. Its repertoire includes heavyweights such as Bad Bunny, Janet Jackson, Daddy Yankee, T.I., Sean Kingston, Anuel, and hundreds more.
Music Business Worldwide