Note: I am not an audio expert or even close to one. This post may describe amateur attempts to do something very trivial. Link to the GitHub repository.
Ever had one of those moments when you are super excited about accomplishing a challenge, about having put something useful on the table, only to realize it is not even close to the greatness you imagined, and end up bitter about how futile it was? This weekend I built something to detect earphone button presses and control VLC media player with them, but it was not so useful after all.
I own a pair of Samsung earphones. Curious about how the buttons used to switch or pause tracks on smartphones work, I plugged them into my laptop's combo jack, opened Audacity, pressed a button, and the result was:
“Great! Awesome find!” exclaimed my mind. So how can we make this into something useful?
Detecting earphone presses
Intuitively, the signal has a very high amplitude, so much so that it gets clipped. So detecting a button press comes down to detecting clipping that lasts for a certain length of time. Here is how I did it.
    import struct
    import pyaudio

    stream = pyaudio.PyAudio().open(format=self.FORMAT, channels=1, rate=self.RATE,
                                    input=True, frames_per_buffer=1)
    while True:
        data = stream.read(1)                     # read one sample (raw bytes)
        int_sample = struct.unpack("i", data)[0]  # convert the bytes to a 32-bit integer
First I used the PyAudio library to read one sample at a time. Since the sample arrives as a raw byte string rather than a number, I needed to convert it to an integer, for which the “struct” module comes in handy.
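To make the conversion step concrete, here is a small sketch that unpacks a buffer of raw bytes into integers. It assumes signed 32-bit little-endian samples (what a 32-bit PyAudio format would give on most machines), and the helper name `bytes_to_samples` is made up for this example. It runs without a microphone because the input is faked with `struct.pack`.

```python
import struct

def bytes_to_samples(data):
    """Convert raw little-endian signed 32-bit audio bytes to a list of ints."""
    count = len(data) // 4  # 4 bytes per 32-bit sample
    return list(struct.unpack("<%di" % count, data[:count * 4]))

# Two fake samples: one at full negative scale (clipped) and one quiet one.
raw = struct.pack("<2i", -2**31, 1234)
samples = bytes_to_samples(raw)
print(samples)
```

Unpacking a whole buffer in one call like this would also allow reading more than one sample per `stream.read()`, which is usually cheaper than going sample by sample.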
    # Set up once, before the loop.
    count_min = 0
    button_down = False

    # Inside the loop, for every sample:
    # Count the number of clipped samples. The first condition
    # makes sure double events don't fire after a long press.
    if not button_down and int_sample <= self.THRESHOLD_MIN:
        count_min += 1
        if count_min >= self.THRESHOLD_SAMPLES:
            count_min = 0
            button_down = True
            if self.print_e:
                print("button_down")
    else:
        count_min = 0
So how do we detect clipping? By finding a streak of clipped samples. In this code, I look for a continuous sequence of samples below a “minimum threshold”. If the count of these samples exceeds some value (about 800 samples at a 22 kHz sampling rate, which is roughly 36 ms), we know the signal was clipped. The same can be applied to a streak of samples above a “maximum threshold”.
Using the above, we can detect “button down” and “button up” events. A long hold can also be detected by measuring the time elapsed after the “button down” event. If it exceeds, let's say, 1.5 s and no “button up” event has fired, then it is probably a “button hold” event.
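The same idea can be tried offline on a plain list of samples. This is only a sketch of the streak-counting approach, not the code from the repository: the function name, the threshold value and the fake signal are all assumptions chosen for illustration.

```python
def find_clipped_streaks(samples, threshold_min, min_streak):
    """Return the start index of every run of at least `min_streak`
    consecutive samples at or below `threshold_min`."""
    starts = []
    count = 0
    reported = False  # avoid reporting the same run twice
    for i, s in enumerate(samples):
        if s <= threshold_min:
            count += 1
            if count >= min_streak and not reported:
                starts.append(i - count + 1)
                reported = True
        else:
            count = 0
            reported = False
    return starts

THRESHOLD_MIN = -2**31 + 1000  # hypothetical "almost full negative scale"
# 5 quiet samples, 10 clipped, 5 quiet, 3 clipped (too short), 2 quiet.
signal = [0] * 5 + [-2**31] * 10 + [0] * 5 + [-2**31] * 3 + [0] * 2
streaks = find_clipped_streaks(signal, THRESHOLD_MIN, min_streak=8)
print(streaks)
```

Only the 10-sample run is long enough to count, so short spikes of noise are ignored.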
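The hold logic can be sketched as a tiny state machine. This is an illustration under my own assumptions, not the repository code: the class name `HoldDetector` and its methods are hypothetical, and timestamps are passed in explicitly (in seconds) so the example does not depend on a real clock.

```python
class HoldDetector:
    def __init__(self, hold_seconds=1.5):
        self.hold_seconds = hold_seconds
        self.down_at = None      # time of the last "button down", None if up
        self.hold_fired = False  # True once "hold" was reported for this press

    def button_down(self, now):
        self.down_at = now
        self.hold_fired = False

    def button_up(self, now):
        """Return "press" for a short press, None if a hold already fired."""
        if self.down_at is None:
            return None
        was_hold = self.hold_fired
        self.down_at = None
        self.hold_fired = False
        return None if was_hold else "press"

    def tick(self, now):
        """Call periodically; returns "hold" once per long press."""
        if (self.down_at is not None and not self.hold_fired
                and now - self.down_at >= self.hold_seconds):
            self.hold_fired = True
            return "hold"
        return None

d = HoldDetector()
d.button_down(0.0)
short = d.button_up(0.3)     # released quickly
d.button_down(1.0)
early = d.tick(2.0)          # only 1.0 s since button down
held = d.tick(2.6)           # 1.6 s since button down
again = d.tick(3.0)          # hold was already reported
released = d.button_up(3.2)  # no extra "press" after a hold
```

In the real loop, `tick` would be called with the current time on every batch of samples, and the "hold" result would trigger the seek action.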
VLC HTTP Interface
Did you guys know that VLC can be controlled through a variety of interfaces, including an HTTP one? It's one of those pieces of software that are much more impressive than their proprietary counterparts. I built a small library to communicate with VLC via the HTTP interface.
In the end, I combined all of the above so that pressing the earphone button pauses or plays the media, and holding it seeks 5 seconds forward.
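For reference, here is roughly what talking to the HTTP interface can look like. This is a sketch, not my library: the function names are made up, and it assumes VLC's web interface is enabled on `localhost:8080` with a password set (VLC uses HTTP basic auth with an empty user name). The `pl_pause` and `seek` commands go through the `requests/status.xml` endpoint, as far as I understand VLC's request format.

```python
import base64
import urllib.parse
import urllib.request

def vlc_command_url(command, host="localhost", port=8080, **params):
    """Build a URL for VLC's requests/status.xml endpoint."""
    query = urllib.parse.urlencode(dict(command=command, **params))
    return "http://%s:%d/requests/status.xml?%s" % (host, port, query)

def send_vlc_command(url, password=""):
    """Send the request with HTTP basic auth (empty user name)."""
    token = base64.b64encode((":" + password).encode()).decode()
    req = urllib.request.Request(url, headers={"Authorization": "Basic " + token})
    with urllib.request.urlopen(req, timeout=2) as resp:
        return resp.status

# Button press: toggle play / pause. Button hold: seek 5 seconds forward.
pause_url = vlc_command_url("pl_pause")
seek_url = vlc_command_url("seek", val="+5s")
print(pause_url)
print(seek_url)
```

Note that the `+` in the seek value has to be URL-encoded (it becomes `%2B`), which `urlencode` takes care of. Calling `send_vlc_command(pause_url, password="...")` only works while VLC is running with the web interface turned on.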
But pretty useless
While making this, I didn’t realize that my laptop has a combo jack, which makes it possible to record and play through the same device, just like on smartphones. Computers without this feature, which means almost all of them, are out of luck.
On the whole, it was a nice weekend hack which got me started with the basics of audio and how it works. I will try to take it further next time; build a whistle detector, maybe.