Using Scikit’s ICA to separate audio sources

Independent Component Analysis (ICA) is a decomposition method that separates a multivariate signal into additive components, on the assumption that the source components are statistically independent of each other and follow non-Gaussian distributions.

The classic application of ICA is the cocktail party problem. Say that at a cocktail party you set up two or more microphones at different locations to record two separate conversations. Each microphone picks up both conversations simultaneously, so you’d expect to hear conversation #1 on both microphones, and likewise conversation #2. With ICA, we can (to a certain degree) separate those recordings into two audio streams, each isolating one conversation.
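
To make the setup concrete, here is a minimal sketch on synthetic data (it is not the script linked below). The two "conversations" are stand-ins: a sine and a square wave, mixed by a made-up 2x2 matrix that plays the role of the two microphones. The frequencies, the mixing matrix and the noise level are all assumptions picked for illustration. Since ICA recovers sources only up to order, sign and scale, the check at the end compares absolute correlations rather than raw values.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
sample_rate = 8000                      # assumed rate, one second of signal
t = np.arange(sample_rate) / sample_rate

# Two hypothetical non-Gaussian sources standing in for the conversations.
s1 = np.sin(2 * np.pi * 200 * t)            # sine wave
s2 = np.sign(np.sin(2 * np.pi * 300 * t))   # square wave
sources = np.c_[s1, s2]
sources += 0.02 * rng.standard_normal(sources.shape)  # a little sensor noise

# Instantaneous (memoryless) mixing: each microphone hears a weighted
# sum of both sources. The weights are illustrative, not measured.
mixing = np.array([[1.0, 0.5],
                   [0.6, 1.0]])
mixtures = sources @ mixing.T           # shape (n_samples, n_microphones)

# Estimate the sources from the mixtures alone.
ica = FastICA(n_components=2, random_state=0)
estimated = ica.fit_transform(mixtures)  # shape (n_samples, n_components)

# Absolute correlation between each true source (rows) and each
# estimated component (columns); order and sign are arbitrary in ICA.
corr = np.abs(np.corrcoef(sources.T, estimated.T)[:2, 2:])
best_match = corr.max(axis=1)
print(best_match)
```

Each entry of `best_match` close to 1 means the corresponding source was recovered well by one of the components.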

Here’s my example code that uses scikit-learn’s FastICA to separate audio signals: https://github.com/subokita/Sandbox/blob/master/blind_source.py

You can find a variety of audio data at http://www.ism.ac.jp/~shiro/research/blindsep.html. The data range from straightforward linear mixtures to some of the more difficult environments.
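
For recordings stored as WAV files, the workflow could look roughly like the sketch below. It assumes mono recordings that share one sample rate, and the helper names `to_int16` and `separate_wavs` are hypothetical, made up for this example. The output is rescaled to 16-bit PCM before writing, because ICA returns floating-point components with arbitrary scale, and floating-point WAV files are not readable by every player.

```python
import numpy as np
from scipy.io import wavfile
from sklearn.decomposition import FastICA


def to_int16(signal):
    """Scale a float signal so its peak fills the 16-bit PCM range."""
    peak = np.max(np.abs(signal))
    if peak == 0:
        return np.zeros(signal.shape, dtype=np.int16)
    return (signal / peak * 32767).astype(np.int16)


def separate_wavs(input_paths, output_prefix):
    """Unmix mono WAV recordings; write one WAV per estimated source.

    Hypothetical helper: assumes one mono file per microphone and a
    common sample rate. Returns the list of written file paths.
    """
    rates, channels = zip(*(wavfile.read(path) for path in input_paths))
    if len(set(rates)) != 1:
        raise ValueError("all recordings must share one sample rate")

    # Trim to the shortest recording and stack as (n_samples, n_microphones).
    length = min(len(channel) for channel in channels)
    mixtures = np.column_stack(
        [channel[:length].astype(np.float64) for channel in channels]
    )

    ica = FastICA(n_components=len(input_paths), random_state=0)
    estimated = ica.fit_transform(mixtures)

    output_paths = []
    for i in range(estimated.shape[1]):
        path = f"{output_prefix}_{i}.wav"
        wavfile.write(path, rates[0], to_int16(estimated[:, i]))
        output_paths.append(path)
    return output_paths
```

A call such as `separate_wavs(["mic_1.wav", "mic_2.wav"], "source")` (file names are placeholders) would write `source_0.wav` and `source_1.wav`. Which output corresponds to which conversation is arbitrary, since ICA does not preserve source order.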

4 thoughts on “Using Scikit’s ICA to separate audio sources”

  1. Eric Krief

    Hi, I stumbled into your profile on GitHub by searching for Blind Source Separation.
    I was trying to solve the “Cocktail Party” problem from Andrew Ng’s course on Coursera.
    I have been trying to solve this for quite a while and am struggling with it.
    I saw the code you wrote here https://github.com/subokita/Sandbox/blob/master/blind_source.py
    and I tried to use it but it doesn’t work for me.
    I also tried to change and modify a bit but still nothing.
    This is very important for me, is there any way you could help me with this?
    I would really appreciate any help if you are ok with that.

  2. sub

    What’s the issue, as in what doesn’t work? I haven’t touched the thing for the past 4 years, so it’s most likely broken due to library updates.

  3. Eric Krief

    It outputs corrupt wav files. So I decided to try and convert them to mp3 so I could open them.
    The mp3 files work, but unfortunately it did not separate the two sources at all. All it did was basically return the same mixed files at a lower volume.
    I know it’s not going to separate the two sources completely, but it’s not even close to the results Andrew Ng got in his lecture.
    When you originally coded this, did you get good results?
    What should I try next?

  4. sub

    When I originally wrote it, the separation was clean, but the catch is that the audio sources were instantaneous mixtures rather than convolutive ones. From what I know, ICA doesn’t really work on convolutive mixtures (and most real recordings are convolutive, considering phase shifts, time delays, etc.).

    So in your case the data might simply not be instantaneous mixtures. But I’m wondering why the output is corrupted; could you send me your WAV files at saburo.okita (at) gmail.com?
    Here’s a place where you can find instantaneous mixtures: http://www.ism.ac.jp/~shiro/research.html
