Mar 30, 2014

Making Earphone Presses Useful with PyAudio and VLC HTTP API

Note: I am not an audio expert or even close to one. This post may describe amateurish attempts at something very trivial. Link to the GitHub repository.

Ever had one of those moments when you are super excited about completing a challenge and putting something useful on the table, only to realize it is nowhere near as great as you imagined, and end up feeling bitter and futile? This weekend I built something to detect earphone button presses and control VLC media player with them, but it was not so useful after all.

Earphone Presses

Samsung Earphone

(Link to original image)

I own a pair of Samsung earphones. Intrigued by how the buttons used to switch or pause tracks on smartphones work, I plugged the earphones into my laptop's combo jack, started recording in Audacity, pressed a button, and the result was:

Wave form for button press

“Great! Awesome find!” exclaimed my mind. So how can we make this into something useful?

Detecting earphone presses

The signal has a very high amplitude, so high that it gets clipped. So detecting an earphone press comes down to detecting clipping that lasts for a certain length of time. Here is how I did it.

import struct
import pyaudio

stream = pyaudio.PyAudio().open(format=self.FORMAT, channels=1, rate=self.RATE, input=True, frames_per_buffer=1)

# inside the read loop:
data = stream.read(1)  # read one sample (one frame) as raw bytes
int_sample = struct.unpack("i", data)[0]  # convert the raw bytes to a 32-bit integer

First I used the PyAudio library to read one sample at a time. Since each sample arrives as a raw byte string rather than a number, I needed to convert it to an integer, for which the “struct” module comes in handy.
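To make the conversion concrete, here is a minimal sketch that needs no audio hardware. It assumes 32-bit signed samples in little-endian byte order, which is an assumption about the capture format, and the helper name decode_sample is hypothetical rather than part of the original code.

```python
import struct


def decode_sample(raw_bytes):
    """Decode one mono frame of 4 raw bytes into a signed 32-bit integer."""
    # "<i" means little-endian signed 32-bit int (assumed sample layout).
    return struct.unpack("<i", raw_bytes)[0]


# A sample clipped at the bottom of the range is the most negative 32-bit value.
clipped_low = struct.pack("<i", -2**31)
print(decode_sample(clipped_low))
```

The explicit "<" pins the size to 4 bytes regardless of platform, whereas the plain "i" used in the post follows the machine's native layout.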

count_min = 0
button_down = False

# Count the number of clipped samples. The first condition
# makes sure double events don't fire after a long press.

if not button_down and int_sample <= self.THRESHOLD_MIN:
    count_min += 1
    if count_min >= self.THRESHOLD_SAMPLES:
        count_min = 0
        button_down = True
        if self.print_e:
            print("button_down")
elif int_sample > self.THRESHOLD_MIN:
    count_min = 0  # streak broken, so the count only covers continuous samples

So how do we detect clipping? By finding a streak of clipped samples. In this code, I try to find a continuous sequence of samples below a “minimum threshold”. If the count of these samples exceeds some value (about 800 samples at a 22 kHz sampling rate, roughly 36 ms), we know the signal was clipped. The same approach works for a streak of samples above a “maximum threshold”.
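The streak idea can be sketched on its own, without any audio input. This is only an illustration under assumptions: the function name find_clipped_streak, the toy data, and the threshold values are made up for the example and are not taken from the repository.

```python
def find_clipped_streak(samples, threshold_min, min_streak):
    """Return the index at which a run of `min_streak` consecutive samples
    at or below `threshold_min` completes, or None if no such run exists."""
    streak = 0
    for index, sample in enumerate(samples):
        if sample <= threshold_min:
            streak += 1
            if streak >= min_streak:
                return index
        else:
            streak = 0  # run broken, start counting again
    return None


# Toy data: the threshold and streak length here are illustrative only.
LOW = -2**31
noisy = [0, LOW, LOW, 5, LOW, LOW, LOW, LOW, 0]
print(find_clipped_streak(noisy, threshold_min=LOW + 1000, min_streak=4))  # prints 7
```

The first two clipped samples do not count, because the run is interrupted before it reaches four samples; only the second run is long enough.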

Using the above, we can detect “button down” and “button up” events. A long hold can also be detected by measuring the time elapsed after the “button down” event. If it exceeds, let's say, 1.5 s and no “button up” event has fired, it probably means a “button hold” event.
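One possible way to turn those events into a press or a hold is to compare timestamps. The sketch below is an assumption about how this could be structured, not the repository's implementation; classify_press and its parameters are hypothetical names, and the 1.5 s value is the one suggested above.

```python
HOLD_SECONDS = 1.5  # hold threshold suggested in the text


def classify_press(down_time, up_time, now, hold_seconds=HOLD_SECONDS):
    """Classify a button press from timestamps given in seconds.

    `up_time` is None while the button is still down.
    Returns "hold", "press", or None when it is too early to tell.
    """
    if up_time is not None:
        duration = up_time - down_time
        return "hold" if duration >= hold_seconds else "press"
    if now - down_time >= hold_seconds:
        return "hold"  # still down and past the threshold
    return None  # still down, but not long enough yet


print(classify_press(0.0, 0.2, 0.2))   # short tap
print(classify_press(0.0, None, 2.0))  # button still held after 2 s
```

Passing `now` in explicitly, instead of reading the clock inside the function, keeps the logic easy to check with made-up timestamps.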

VLC HTTP Interface

Did you know that VLC can be controlled through a variety of interfaces, including an HTTP one? It's one of those pieces of software that are much more impressive than their proprietary counterparts. I built a small library to communicate with VLC via the HTTP interface.
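As a rough idea of what talking to that interface looks like, here is a minimal Python 3 sketch. It is not the library from the repository. It assumes VLC's HTTP interface is enabled on localhost port 8080 with a password set, and it relies on the requests/status.xml endpoint with the pl_pause and seek commands as I understand them from VLC's HTTP request documentation. The function names are hypothetical.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def vlc_command_url(command, host="localhost", port=8080, **params):
    """Build a status.xml URL for VLC's HTTP interface (assumed defaults)."""
    query = urlencode([("command", command)] + sorted(params.items()))
    return "http://%s:%d/requests/status.xml?%s" % (host, port, query)


def send_vlc_command(command, password, **params):
    """Send a command using HTTP Basic auth with an empty user name."""
    token = base64.b64encode((":" + password).encode("utf-8")).decode("ascii")
    request = Request(vlc_command_url(command, **params),
                      headers={"Authorization": "Basic " + token})
    with urlopen(request, timeout=2) as response:
        return response.read()


# Building the URLs needs no running VLC instance.
print(vlc_command_url("pl_pause"))          # toggle play / pause
print(vlc_command_url("seek", val="+5s"))   # seek 5 seconds forward
```

The plus sign in "+5s" is percent-encoded by urlencode, since a bare "+" in a query string would otherwise be read as a space. Calling send_vlc_command("pl_pause", "secret") would only work with VLC running and "secret" replaced by the configured password.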

In the end, I combined all of the above so that pressing the earphone button pauses or plays the media, and holding it seeks 5 seconds forward.

But pretty useless

While making this, I didn’t realize that it only works because my laptop has a combo jack, which makes it possible to record and play through the same jack, just like on smartphones. Computers without a combo jack cannot use this, and that means almost all of them.

On the whole, it was a nice weekend hack that got me started with the basics of audio and how it works. I will try to take it further next time; build a whistle detector, maybe.
