Understanding music discovery algorithms – How to amplify an artist’s visibility across streaming platforms

Recommendation on streaming platforms

This piece is based on the panel about Streaming & Algorithms I organized with shesaid.so France during the JIRAFE event put together by the Réseau MAP in Paris, where I interviewed Elisa Gilles, Data Scientist Manager at Deezer, and Milena Taieb, Global Head of Trade Marketing and Partnerships at Believe, about music discoverability on digital streaming platforms.

The idea
Understanding how music discovery algorithms work and including this knowledge in marketing plans can boost a song release campaign.
How it works
Algorithms can amplify momentum around a song or artist. To best leverage them, 1/ get the metadata right when distributing songs to streaming platforms, so that classification is accurate; 2/ engage a community of early fans to help recommender systems understand who the song is the best fit for.

Algorithms are at the heart of streaming services. Their catalogs now exceed 70M tracks, and recommendation algorithms have become essential tools that help users navigate this virtually unlimited pool of artists and songs. The most prominent examples can be found in systems powering personalized playlists like Spotify’s Discover Weekly and Release Radar, or Deezer Flow; but streaming personalization extends far beyond such discovery features. Home section layouts on most streaming platforms are personalized, and so are the search results. Algorithms are also used to pitch users similar content, determining which artists or songs are showcased next to the ones you are currently looking at. YouTube Chief Product Officer Neal Mohan shared at CES 2018 that recommendations are responsible for about 70 percent of the total time users spend on YouTube.

Recommendation algorithms are now at the heart of digital music consumption, so I cannot stress this enough: to optimize artist visibility in the modern streaming landscape, it’s crucial to understand how these algorithms work.

From where do people stream?

As Milena Taieb, Global Head of Trade Marketing and Partnerships at Believe, pointed out during our interview: 68% of total streams are user-driven — people streaming from their library, their own playlists, or searching for their favorite albums or artists. 14% of streams are algorithm-driven, and 10% editorially driven. This is far from YouTube’s 70% algorithm-mediated consumption share, but that doesn’t make algorithms any less important. To get added to a user’s library or personal playlist, an artist needs to get discovered by that user first — and it’s editorial and algorithmic playlists that will often help get them there.

From where do people stream? Believe Digital data, 2020

The fact that most people stream from their libraries and personal playlists doesn’t mean that’s where you should concentrate all of your attention. Yes, the goal is to move the listener from “passive” streams (originating from algorithmic or editorial playlists) to “active”, user-driven streaming — but in most cases you can’t have the latter without the former. Put simply, to get user-driven streams, you need to build up algorithmic discovery first.

A side note on COVID-19: lockdown had little impact on those discovery patterns. Elisa Gilles, Data Scientist Manager at Deezer, told me that she noticed a peak in kids’ content and live radio consumption, while the usual peaks during commute hours evened out across the day. However, overall behavior regarding recommendations didn’t change much. Overall streaming was down by about 15 to 20% for the first few weeks, but soon returned to normal volumes.

So, what influences a song’s discoverability and its chances to be recommended?


First of all, let me explain how recommendation systems work. There are two main ways to build recommendations for a user:

  1. By content similarity — “I recommend that you listen to an emerging hip-hop artist because you listen to a lot of hip-hop”
  2. By behavioral similarity — “I recommend that you listen to Tones & I because most users who listen to the same artists you do also listen to Tones & I”

The latter is also known as “The Netflix” approach or collaborative filtering.

The graph above is an example from the music discovery team at Spotify, looking at which artists are most commonly added to playlists together, and then using these probabilities to drive recommendations.
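To make the collaborative filtering idea concrete, here is a toy Python sketch of the same principle: count how often artists are added to the same playlists and use those counts as a similarity signal. The playlists and numbers are made up, and this is an illustration of the approach, not how Spotify actually implements it.

```python
from collections import Counter
from itertools import combinations

# Toy playlists: each is a set of artists that users added together.
playlists = [
    {"Tones & I", "Billie Eilish", "Lorde"},
    {"Tones & I", "Billie Eilish", "Dua Lipa"},
    {"Lorde", "Billie Eilish", "Lana Del Rey"},
    {"Dua Lipa", "Tones & I"},
]

# Count how often each pair of artists appears in the same playlist.
co_counts = Counter()
for playlist in playlists:
    for a, b in combinations(sorted(playlist), 2):
        co_counts[(a, b)] += 1

def similar_artists(artist, top_n=3):
    """Rank the artists most frequently playlisted together with `artist`."""
    scores = Counter()
    for (a, b), count in co_counts.items():
        if artist == a:
            scores[b] += count
        elif artist == b:
            scores[a] += count
    return scores.most_common(top_n)

print(similar_artists("Tones & I"))
# e.g. [('Billie Eilish', 2), ('Dua Lipa', 2), ('Lorde', 1)]
```

Real systems work on billions of such co-occurrence signals (plays, saves, searches), but the underlying logic is the same: “users who add X also tend to add Y”.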

Most streaming platforms use a combination of both content and behavioral approaches to power their recommendation systems. However, the exact way they describe music and analyze listening patterns remains the “secret recipe” of each recommendation engine.

How to optimize for content similarity?

Content similarity is usually more important when it comes to freshly released songs that don’t have much in the way of streaming behaviour and playlist additions for the platform to analyze. This is known as the “cold start” problem — to overcome it, artists are asked to fill in the initial information (i.e. metadata) about their songs when they submit music to distributors: title, artist, label, main genre, secondary genre, etc. Filling in these fields as accurately as possible is very important, as this data will be the basis for the initial song classification across streaming services.

Example of a single submission form on TuneCore 
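To give an idea of what that metadata looks like in practice, here is a hypothetical Python sketch of a track’s metadata bundle. The field names are invented for the example and don’t correspond to TuneCore’s actual form or any platform’s schema.

```python
# Hypothetical track metadata, roughly matching the fields a distributor asks for.
# Field names are illustrative only, not an actual distributor or DSP schema.
track_metadata = {
    "title": "Midnight Drive",
    "artist": "Jane Doe",
    "featured_artists": [],
    "label": "Independent",
    "main_genre": "Electronic",
    "secondary_genre": "Synthwave",
    "language": "English",
    "explicit": False,
    "isrc": "XX-XXX-24-00001",   # placeholder unique recording identifier
    "release_date": "2024-06-21",
    "lyrics_provided": True,
}

# Inaccurate or missing fields here degrade the initial classification that
# platforms rely on before any streaming data exists (the "cold start" phase).
```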

That said, streaming services usually don’t rely on the metadata alone. Broad genre tags like “Pop” or “Dance” may take on different meanings depending on the context — and so streaming platforms develop their own content analysis systems to expand on that basic data. Such tools allow them to analyze raw audio files coupled with the provided metadata to assign narrower content tags and power initial content similarity recommendations.
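As a toy illustration of the content-similarity side (not any platform’s actual audio analysis), the sketch below compares a fresh release to catalogue tracks using the cosine similarity of made-up audio feature vectors:

```python
import math

# Made-up audio feature vectors, e.g. normalised tempo, energy, acousticness, danceability.
tracks = {
    "new_release":   [0.62, 0.80, 0.10, 0.75],
    "hiphop_hit":    [0.60, 0.85, 0.05, 0.80],
    "acoustic_folk": [0.35, 0.30, 0.90, 0.40],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (1.0 = identical direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Rank catalogue tracks by audio similarity to the fresh release.
query = tracks["new_release"]
ranking = sorted(
    ((name, cosine_similarity(query, vec)) for name, vec in tracks.items() if name != "new_release"),
    key=lambda item: item[1],
    reverse=True,
)
print(ranking)  # the hip-hop track comes out far closer than the acoustic one
```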

So, making sure the song is properly described and that all possible data is provided (including lyrics and even the label name) can go a long way toward helping your music get discovered. Making sure metadata is right is Discoverability 101.

How to optimize for behavioral similarity?

As I’ve mentioned above, a behavioral similarity approach only works when there are some listens, searches, playlists additions, saves and other consumption patterns for the algorithm to analyze. But how can you leverage that to amplify the artist’s visibility across streaming platforms?

Well, the first step is to identify which artists and songs have an affinity with your music. In which playlists does your song belong? Who are the other artists featured in those playlists? Chances are that users who like those artists and listen to those playlists will also dig your music. The more users who like your songs listen to other similar songs and artists, the more relevant patterns there are for the algorithm to analyze. And the more patterns there are for the algorithm to analyze, the better it will get at matching your music with your potential audience.

That means, for instance, that there is next to no point in paying for random streams. They won’t help the algorithm to qualify your song and recommend it to the right users — on the contrary, they will establish fake consumption patterns that will only hurt your discoverability.

Instead, what works is:

  • getting played and added to playlists by fans who enjoy your music and your style: they will also listen to other artists similar to you, and help the algorithm understand where you belong;
  • getting on curated playlists that are focused on your style or genre.

As you can see, optimizing for editorial and algorithmic playlists works really well together. Beware, though — editorial playlists have to be focused on your genre, especially if you are an emerging artist who’s just starting to build a fan base. Getting featured in a huge editorial playlist — something like Spotify’s “New Music Friday”, for example — can be a double-edged sword. Such discovery playlists blend many artists that may not have much in common, at least sonically, with your music. In a way, too much exposure that comes too soon — that is, before your music is properly qualified — can lead the algorithm to push it semi-randomly to unqualified users, which is likely to result in high skip rates and hurt your song’s long-term potential.

Algorithms are becoming the primary source of music discovery. The latest research from MRC Data/Nielsen Music highlights that 62% of people surveyed said streaming services are among their top music discovery sources, while “just” 54% named friends and family. These algorithms don’t operate in a vacuum, though: they work by analyzing how fans listen to your music. Building an engaged and active community around your artists and their music is still the key to running a successful and sustainable music career. These fans, even if their number is small, are your biggest resource: they will help you spread the word about your music and find new listeners. Beyond that, they are the ones who will help algorithms pick up on your momentum and amplify it through the recommender systems.


Dig Deeper

If you’re curious to learn more about how you can find the right strategies (and right spaces) to promote your artists, check out the piece I wrote for Cherie Hu’s Water & Music on how to use data to market new releases, which includes a section on how to find relevant playlists to target in your pitching campaign. To dig even deeper into understanding how your music is classified, Bas Grasmayer and Carlo Kiksen put together a tutorial on how the Spotify AI categorises your music and how to check out your song’s audio analysis.

Can robots write musical masterpieces?

I wanted to comment on the overall assumption we commonly see in publications that AI will never write a “critically acclaimed hit” or out-Adele Adele. 

It is usually very politically correct (and less frightening) to suggest that AI can’t make art better than humans. It’s okay to let it take over automated tasks, but we like to think that more “right brain” activities are not that easily replicable. We hold on to the belief that only human creations can touch someone’s heart and mind. The way we humans create music requires getting in touch with one’s own feelings and finding means of expression, on top of mastering one or more instruments.

The truth is, AI can write songs as well as humans can, if not better. “Beauty is in the ear of the listener”, if I may 🙂 If you think about creativity as exploring unexplored territories, mixing or creating new sounds, trying new combinations, then AI has a lot more creative juice than any human brain. It can explore more than we can, with far fewer mental barriers about what should or shouldn’t be tried or experienced.

“Of all forms of art, music is probably the most susceptible to Big Data analysis, because both inputs and outputs lend themselves to mathematical depiction”. 

Yuval Noah Harari

The real argument here is more about the very definition of an artist.

I just googled it to see what’s commonly used to describe an artist. Here’s Cambridge’s definition:

  • “someone who paints, draws, or makes sculptures.”
  • “someone who creates things with great skill and imagination.”

This definition will evolve as musicians use AI to explore, and won’t have to produce so much entirely by themselves.

Most likely, in the future, being able to produce won’t matter as much as telling a story and having a personality that people will want to follow and hear more of. Hanging out at FastForward earlier this year, we discussed artist careers and what makes people become fans of artists.

Depending on musical genres and audiences, it is a mix of musical skill, personality, familiarity and storytelling that creates fandom. Song quality by itself is definitely part of these requirements, but it is usually not enough to create an audience. For now, we don’t have any AI mastermind replicating both personality and songwriting. So, artists are not directly replicable per se, but both types of AI do already exist.

In the near future, unless laws banning anthropomorphism pass throughout the world, we are even bound to see the likes of Lil Miquela, fictional artists, releasing singles on Spotify. Just like real artists, these fictional artists will have whole teams behind them to manage their careers.

Will they write better songs than Adele? 

There is some evidence, which I’m sharing here, that AI can already write beautiful masterpieces. I found the following study while reading Homo Deus, by Yuval Noah Harari, an essay about what awaits humankind in the AI era:

“David Cope has written programs that compose concertos, chorales, symphonies and operas. His first creation was named EMI (Experiments in Musical Intelligence), which specialised in imitating the style of Johann Sebastian Bach. It took seven years to create the program, but once the work was done, EMI composed 5,000 chorales à la Bach in a single day.”

“Professor (…) Larson suggested that professional pianists play three pieces one after the other: one by Bach, one by EMI, and one by Larson himself. The audience would then be asked to vote who composed which piece. Larson was convinced people would easily tell the difference between soulful human compositions, and the lifeless artefact of a machine. Cope accepted the challenge. On the appointed date, hundreds of lecturers, students and music fans assembled in the University of Oregon’s concert hall. At the end of the performance, a vote was taken. The result? The audience thought that EMI’s piece was genuine Bach, that Bach’s piece was composed by Larson, and that Larson’s piece was produced by a computer.”

When an audience is not biased, listeners can hardly tell the difference between Bach, an AI or an unknown writer.

Can an AI write a masterpiece? Yes. You may argue that an AI is trained on a given dataset (e.g. a set of songs), depriving it of free will as to what is actually produced. However, an AI can be trained to learn from the best composers, exactly like a human would have various musical influences and attend masterclasses taught by virtuosos.

One fundamental difference still remains: joy and creative flow. A machine will hardly derive as much joy from the creative process as we do.

Google Magenta, going forward with AI-Assisted Music Production?

Google Magenta

Two years ago, Google launched Magenta, a research project that explores the role of AI in the processes of creating art and music. I dug a bit deeper into where they currently stand, and they already have many demos showcasing how machine learning algorithms can help artists in their creative process.

I insist on the word help. In my opinion, these technologies are not created to replace artists. The goal is to enable them to explore more options, and thus potentially spark more creativity.

“Music is not a “problem to be solved” by AI. Music is not a problem, period. Music is a means of self expression. (…) What AI technologies are for, then, is finding new ways to make these connections. And never before has it been this easy to search for them.” Tero Parviainen

When you write a song, usually one of the first things you pick is which instruments you and/or your band are going to play. Right from the start, creativity already hits boundaries set by the finite number of instruments you have on hand.

That’s why today I’m sharing more about a project called NSynth. Short for Neural Synthesizer, NSynth enables musicians to create new sounds by combining existing ones in a very easy way.

You can try it for yourself with their demo website here: 

Nsynth Sound Maker Demo

Note that it doesn’t have to be musical instruments: you can, for instance, create a new sound based on a pan flute and a dog 🙂

Why would you want to mix two sounds? Sure, software already enables you to create your own synthesisers, and you could just play two instrument samples at the same time.

Blending two instruments together in a new way is basically creating new sounds, like a painter would create new colors by blending them on their palette. See this as new sounds on your palette.

How NSynth works to generate sounds

NSynth is an algorithm that generates new sounds by combining the features of existing sounds. To do that, the algorithm takes different sounds as input. You teach the machine (a deep learning network) how music works by showing it examples. 

The technical challenge here is to find a mathematical model to represent a sound so that an algorithm can make computations. Once this model is built, it can be used to generate new sounds.

NSynth Autoencoder

The sound input is compressed into a vector by an encoder capable of extracting only the fundamental characteristics of a sound, using a latent space model. In our case, the sound input is reduced to a 16-dimensional numerical vector. The latent space is the space in which data lies in the bottleneck (Z on the drawing below). In the process, the encoder ideally distills the qualities that are common to both audio inputs. These qualities are then interpolated linearly to create new mathematical representations, which are then decoded into new sounds that have the acoustic qualities of both inputs.

In a simpler version:

Simplified diagram of the NSynth autoencoder (input sound → encoder → latent code Z → decoder → output sound)

To sum up, NSynth is an example of an encoder that has learned a latent space of timbre in the audio of musical notes.
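To make the encode → interpolate → decode idea concrete, here is a deliberately tiny PyTorch sketch of an autoencoder with a 16-dimensional latent space and a linear interpolation between two encoded sounds. It is only a stand-in: the real NSynth model is a WaveNet-style autoencoder trained on a large corpus of musical notes, and the random tensors below just stand in for audio frames.

```python
import torch
import torch.nn as nn

FRAME = 1024   # one short audio frame (in samples); real NSynth works on full notes
LATENT = 16    # 16-dimensional latent code, as in the description above

class TinyAutoencoder(nn.Module):
    """Toy stand-in for NSynth's WaveNet autoencoder."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(FRAME, 256), nn.ReLU(), nn.Linear(256, LATENT))
        self.decoder = nn.Sequential(nn.Linear(LATENT, 256), nn.ReLU(), nn.Linear(256, FRAME))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = TinyAutoencoder()  # untrained here; a real model would be trained on many sounds

# Two "sounds" (random frames standing in for, say, a pan flute and a dog bark).
flute = torch.randn(1, FRAME)
dog = torch.randn(1, FRAME)

with torch.no_grad():
    z_flute = model.encoder(flute)          # compress each sound to a 16-d latent vector
    z_dog = model.encoder(dog)
    z_mix = 0.5 * z_flute + 0.5 * z_dog     # linear interpolation in latent space
    new_sound = model.decoder(z_mix)        # decode the blend back into audio samples

print(new_sound.shape)  # torch.Size([1, 1024])
```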

Musicians can try it out on Ableton Live:

Of course, the Magenta team didn’t stop here, and I’ll be back showcasing more of their work soon!


Dance and your Robot will adapt the Music to you – What if Music could be Dynamic?

Most songs usually follow the same structure, alternating verses and choruses with a break to wake you up in the middle. Think about Macklemore & Ryan Lewis – Can’t Hold Us or any other pop song and you’ll easily recognize the pattern.

Instead of music being recorded and arranged the same way, set in stone forever, imagine it could adapt. Adapt to what? I am deliberately vague, since what I saw let my imagination run pretty wild. Let’s see what it does to yours 🙂

Last week, I was invited to my sister’s research lab, Beagle (CNRS/INRIA), to meet the Evomove project team. They developed a living musical companion that uses artificial intelligence to generate music on the fly according to a performer’s moves. Here is a performance where music is produced on the fly by the system:

Performers wear sensors on their wrists and/or ankles, sending data streams to a move recognition AI unit, which analyzes them to adapt the music to the moves.

The team wanted to experiment with bio-inspired algorithms (I’ll explain shortly what that means), and music proved to be a good use case. Dancers could interact with their musical companion in a matter of seconds, enabling the team to apply their algorithm on the fly.

How does it work?

The Evomove system is composed of 3 units (a rough sketch of this pipeline follows the list):

  • a Data Acquisition unit, sensors on performers capturing position and acceleration;
  • a Move Recognition unit, running the subspace clustering algorithm, which finds categories in incoming moves;
  • a Sound Generation unit, controlling the music generation software Ableton Live based on the move categories found.
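Here is a rough Python sketch of how such a pipeline could be wired together, with scikit-learn’s MiniBatchKMeans standing in for the team’s evolutionary subspace clustering algorithm and a simple print standing in for the Ableton Live control. The sensor data is simulated; this only illustrates the three-unit structure, not the actual Evomove implementation.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

N_CATEGORIES = 4  # illustrative number of move categories

def read_sensor_batch(n_frames=8):
    """Data Acquisition unit stand-in: frames of 3D position + 3D acceleration."""
    return np.random.rand(n_frames, 6)

def trigger_sound(category: int):
    """Sound Generation unit stand-in: would select a loop in Ableton Live."""
    print(f"playing loop mapped to move category {category}")

# Move Recognition unit stand-in: online clustering of incoming sensor frames.
clusterer = MiniBatchKMeans(n_clusters=N_CATEGORIES, random_state=0)
clusterer.partial_fit(read_sensor_batch(64))  # bootstrap on a first batch of moves

for _ in range(10):                               # while the performer keeps dancing
    batch = read_sensor_batch()
    clusterer.partial_fit(batch)                  # keep adapting to the incoming moves
    category = int(clusterer.predict(batch[-1:])[0])  # categorize the latest frame
    trigger_sound(category)
```

The key property the sketch tries to convey is the online, unsupervised loop: categories are discovered from the data stream itself, with no presets mapping a specific move to a specific sound.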

 

Where is the bio-inspired artificial intelligence?

“Bio-inspired” means studying nature and life to improve algorithms. Just as inspiration. It doesn’t mean that bio-inspired algorithms have to mimic exactly how nature works. In this case, the team took inspiration from the evolution of microorganisms.

The idea of their approach is inspired by the way microbiota process nutrients: gut microbes pre-process complex flows of nutrients and transfer the results to their host organism. Microbes perform their task autonomously, without any global objective; it just so happens that their host can benefit from it. The innovation resides in this autonomous behavior, otherwise it would be just like any other preprocessing/processing approach.

In the Evomove system, complex data streams from the sensors are processed by the Move Recognition unit (running the evolutionary subspace clustering algorithm), just like gut microbes process nutrients, without the objective of producing any particular set of move categories. The AI unit behaves entirely autonomously, and it can adapt to new data streams if new dancers or new sensors join the performance.

You may have seen other projects where DJs remotely control their set with moves, but the difference here is that the approach is entirely unsupervised: there are no presets, no move programmed to generate a specific sound. While dancing, performers have no idea at first what music their moves are going to produce. The algorithm discovers move categories continuously and dynamically associates sounds with those categories.

How does it feel to interact with music, instead of “just” listening?

“Contrary to most software, where humans act on a system, here the user is acting in the system.”

I interviewed Claire, one of the performers. She felt that while dancing, she was sometimes controlled by the music, and at other times controlling it. She definitely felt a real interaction, and the music would go on as long as she was dancing.


Take a closer look at their wrists and you’ll see sensors.

Congratulations and thanks to Guillaume Beslon, Sergio Peignier, Jonas Abernot, Christophe Rigotti and Claire Lurin for sharing this amazing experience. If you’re interested, you’ll find more details in their paper here: https://hal.archives-ouvertes.fr/hal-01569091/document