pyAudioAnalysis alternatives and similar packages
Based on the "Audio" category.
- SpeechRecognition: Speech recognition module for Python, supporting several engines and APIs, online and offline.
- Essentia: C++ library for audio and music analysis, description and synthesis, including Python bindings.
- Watson Developer Cloud Python SDK: Client library to use the IBM Watson services in Python, available in pip as watson-developer-cloud.
- aeneas: A Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment).
- speechpy: SpeechPy, a library for speech processing and recognition: http://speechpy.readthedocs.io/en/latest/
- praatIO: A Python library for working with Praat, TextGrids, time-aligned audio transcripts, and audio files. It is primarily used for extracting features from and making manipulations on audio files given hierarchical time-aligned transcriptions (utterance > word > syllable > phone, etc.).
- speech-to-text-websockets-python: DISCONTINUED. Python client that interacts with the IBM Watson Speech To Text service through its WebSockets interface.
- ProMo: Prosody Morph, a Python library for manipulating pitch and duration in an algorithmic way, for resynthesizing speech.
- pysle: Python interface to ISLEX, an English IPA pronunciation dictionary with syllable and stress marking.
README
A Python library for audio feature extraction, classification, segmentation and applications
This README gives a general overview. See the complete wiki for detailed documentation, and the introductory article on audio data handling for a more generic intro.
News
- [2022-01-01] If you are not interested in training audio models from your own data, you can check the Deep Audio API, where you can directly send audio data and receive predictions about the respective audio content (speech vs silence, musical genre, speaker gender, etc.).
- [2021-08-06] deep-audio-features: deep audio classification and feature extraction using CNNs and PyTorch.
- Check out paura, a Python script for real-time recording and analysis of audio data.
General
pyAudioAnalysis is a Python library covering a wide range of audio analysis tasks. Through pyAudioAnalysis you can:
- Extract audio features and representations (e.g. MFCCs, spectrogram, chromagram)
- Train, tune the parameters of, and evaluate classifiers of audio segments
- Classify unknown sounds
- Detect audio events and exclude silence periods from long recordings
- Perform supervised segmentation (joint segmentation - classification)
- Perform unsupervised segmentation (e.g. speaker diarization) and extract audio thumbnails
- Train and use audio regression models (example application: emotion recognition)
- Apply dimensionality reduction to visualize audio data and content similarities
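To make the first item in the list above concrete, here is a minimal, standard-library sketch of short-term feature extraction: a signal is cut into overlapping frames, and two simple features (energy and zero-crossing rate) are computed per frame. This is not pyAudioAnalysis's own implementation. The helper names `frame_signal` and `short_term_features`, the synthetic 440 Hz tone, and the 50 ms window with a 25 ms step are all assumptions chosen for illustration.

```python
import math

def frame_signal(signal, window, step):
    """Split a list of samples into overlapping frames of `window` samples."""
    return [signal[i:i + window]
            for i in range(0, len(signal) - window + 1, step)]

def short_term_features(signal, window, step):
    """Return one (energy, zero_crossing_rate) pair per frame (hypothetical helper)."""
    features = []
    for frame in frame_signal(signal, window, step):
        # Mean squared amplitude of the frame
        energy = sum(s * s for s in frame) / len(frame)
        # Count sign changes between consecutive samples
        crossings = sum(1 for a, b in zip(frame, frame[1:])
                        if (a >= 0) != (b >= 0))
        features.append((energy, crossings / (len(frame) - 1)))
    return features

# Synthetic input (assumption): one second of a 440 Hz sine sampled at 8 kHz
fs = 8000
tone = [math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]
feats = short_term_features(tone, window=int(0.050 * fs), step=int(0.025 * fs))
print(len(feats), "frames")
```

The library computes a much richer feature set per frame; the sketch only shows the framing idea that such features share.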
Installation
- Clone the source of this library:
git clone https://github.com/tyiannak/pyAudioAnalysis.git
- Install dependencies:
pip install -r ./requirements.txt
- Install using pip:
pip install -e .
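After installing, it can help to confirm that the package is importable before running any analysis. The sketch below uses only the standard library. `is_installed` is a hypothetical helper, and the module name `pyAudioAnalysis` is taken from the import statement used elsewhere in this README.

```python
import importlib.util

def is_installed(module_name):
    """Return True if `module_name` can be found on the current Python path."""
    return importlib.util.find_spec(module_name) is not None

if is_installed("pyAudioAnalysis"):
    print("pyAudioAnalysis is importable")
else:
    print("pyAudioAnalysis not found; re-run the installation steps")
```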
An audio classification example
More examples and detailed tutorials can be found at the wiki
pyAudioAnalysis provides easy-to-call wrappers for audio analysis tasks. For example, the following code first trains an audio segment classifier from a set of WAV files stored in folders (each folder representing a different class), and then uses the trained classifier to classify an unknown WAV file:
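from pyAudioAnalysis import audioTrainTest as aT
# Train an SVM on two class folders (1.0 s mid-term window and step); the model is saved as "svmSMtemp"
aT.extract_features_and_train(["classifierData/music","classifierData/speech"], 1.0, 1.0, aT.shortTermWindow, aT.shortTermStep, "svm", "svmSMtemp", False)
# Classify an unknown WAV file with the saved model
aT.file_classification("data/doremi.wav", "svmSMtemp","svm")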
Result: (0.0, array([ 0.90156761, 0.09843239]), ['music', 'speech'])
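The result shown above appears to be a tuple of the winning class index, the per-class probabilities, and the class names. Assuming that layout, a short sketch like the following turns it into a readable label. `describe_result` is a hypothetical helper, and a plain list stands in for the NumPy array.

```python
def describe_result(result):
    """Map a (class_index, probabilities, class_names) tuple to (label, probability)."""
    class_index, probabilities, class_names = result
    winner = int(class_index)  # the index arrives as a float, e.g. 0.0
    return class_names[winner], float(probabilities[winner])

# Values copied from the result shown above
result = (0.0, [0.90156761, 0.09843239], ["music", "speech"])
label, probability = describe_result(result)
print(f"{label} ({probability:.1%})")
```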
In addition, command-line support is provided for all functionalities. For example, the following command extracts the spectrogram of an audio signal stored in a WAV file:
python audioAnalysis.py fileSpectrogram -i data/doremi.wav
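When the same command has to be run over several files, it can be convenient to build it programmatically. The sketch below only assembles and prints the spectrogram command shown above, using the same script name, subcommand, and `-i` flag. `spectrogram_command` is a hypothetical helper and the file list is an assumption.

```python
import shlex
from pathlib import Path

def spectrogram_command(wav_path):
    """Build the argument list for the fileSpectrogram command shown above."""
    return ["python", "audioAnalysis.py", "fileSpectrogram", "-i", str(wav_path)]

# Assumed input list; replace with your own WAV files
wav_files = [Path("data/doremi.wav")]
for wav in wav_files:
    # Print a shell-safe version of the command instead of executing it
    print(shlex.join(spectrogram_command(wav)))
```

To actually run a command, the returned list could be passed to `subprocess.run(cmd, check=True)` from the directory that contains `audioAnalysis.py`.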
Further reading
Apart from this README file, to better understand how to use this library one should read the following:
- Audio Handling Basics: Process Audio Files In Command-Line or Python, if you want to learn how to handle audio files from command line, and some basic programming on audio signal processing. Start with that if you don't know anything about audio.
- Intro to Audio Analysis: Recognizing Sounds Using Machine Learning. This goes a bit deeper than the previous article, providing a complete intro to the theory and practice of audio feature extraction, classification and segmentation (includes many Python examples).
- The library's wiki
- How to Use Machine Learning to Color Your Lighting Based on Music Mood. An interesting use case in which this library is used to train a real-time music mood estimator.
- A more general and theoretic description of the adopted methods (along with several experiments on particular use-cases) is presented in this publication. Please use the following citation when citing pyAudioAnalysis in your research work:
@article{giannakopoulos2015pyaudioanalysis,
  title={pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis},
  author={Giannakopoulos, Theodoros},
  journal={PloS one},
  volume={10},
  number={12},
  year={2015},
  publisher={Public Library of Science}
}
For Matlab-related audio analysis material check this book.
Author
Theodoros Giannakopoulos, Principal Researcher of Multimodal Machine Learning at the Multimedia Analysis Group of the Computational Intelligence Lab (MagCIL) of the Institute of Informatics and Telecommunications, of the National Center for Scientific Research "Demokritos".