GeoPython 2021

Audio Signal Processing for Feature Building and Machine Learning
2021-04-23, 20:30–21:00, Track 1

This talk will highlight audio signals, audio processing techniques, feature building, and end-to-end machine learning examples, along with the open source tools that can be leveraged in Python.


Unlike the types of data more commonly handled in industry today, such as numerical, text, or image data, audio signals call for a different approach to extracting information and building machine learning models. This talk will highlight the challenges of audio classification problems, starting with what an audio signal is and what its numerical representation means, how it differs from other data types, what feature extraction from audio looks like and how to go about it, and which open source Python tools can be leveraged for the task.

Digital signal processing, which includes audio processing, is a separate field of study in its own right, and applying parts of it to build successful models on audio data is an interesting and challenging problem. In addition, MATLAB is a popular choice for audio signal processing and comes with strong tooling for it, while Python is a popular choice for machine learning. Building successful audio and speech classification solutions in Python alone therefore presents its own set of challenges.

The focus will then shift to building classification models from features that capture the otherwise hidden information in audio and speech signals, using only open source tools available to Python users. The talk will close with a few example audio classification and prediction problems, each with a Python solution that applies the feature extraction techniques and tools discussed earlier.
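As a rough illustration of the ideas above (not code from the talk), the sketch below treats an audio signal as a one-dimensional array of amplitude samples and derives two common hand-built features from it: zero-crossing rate and spectral centroid. It uses only NumPy and synthetic sine tones; the sample rate, frame sizes, and the function names make_tone, frame_signal, and extract_features are assumptions chosen for this example.

```python
import numpy as np

SAMPLE_RATE = 16_000  # samples per second (assumed for this illustration)


def make_tone(freq_hz, duration_s=1.0, sample_rate=SAMPLE_RATE):
    """Synthesize a sine tone: a 1-D array of amplitudes sampled over time."""
    t = np.arange(int(duration_s * sample_rate)) / sample_rate
    return np.sin(2 * np.pi * freq_hz * t)


def frame_signal(signal, frame_len=512, hop=256):
    """Slice the signal into overlapping frames of shape (n_frames, frame_len)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]


def extract_features(signal, sample_rate=SAMPLE_RATE, frame_len=512, hop=256):
    """Return [mean zero-crossing rate, mean spectral centroid in Hz]."""
    frames = frame_signal(signal, frame_len, hop)

    # Zero-crossing rate: fraction of adjacent sample pairs whose sign differs.
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

    # Magnitude spectrum of each Hann-windowed frame.
    spectrum = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)

    # Spectral centroid: magnitude-weighted mean frequency of each frame.
    centroid = (spectrum @ freqs) / (spectrum.sum(axis=1) + 1e-12)

    return np.array([zcr.mean(), centroid.mean()])


if __name__ == "__main__":
    for freq in (200.0, 2000.0):
        zcr, centroid = extract_features(make_tone(freq))
        print(f"{freq:7.1f} Hz tone: zcr={zcr:.3f}, centroid={centroid:.1f} Hz")
```

Fixed-length feature vectors like these could then be passed to any general-purpose classifier. Real recordings would typically call for richer features and a dedicated audio library rather than this hand-rolled version.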