This is a Matlab routine to automatically predict creakiness using a model of psychoacoustic roughness.
Source code for this library is available from:
How to run the code:
To use, please load creakByRoughness.m in Matlab and specify these parameters in the code:
- A directory where the WAV recordings are located
- A directory where you want the roughness temporal profiles to be stored
- Whether to create plots or not
- Whether to use Praat TextGrids
In the latter case, you should also specify the name of the tier and the label that you’re interested in (an asterisk ‘*’ will match all the labels within the desired tier).
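As a rough illustration of the settings listed above, the sketch below declares them as plain variables and shows one way the asterisk wildcard could select labels within a tier. All variable names, folder names, the tier name, and the label data are hypothetical; they are not taken from creakByRoughness.m, whose actual parameter names may differ.

```matlab
% Hypothetical settings: these names are illustrative only and are NOT
% the ones used inside creakByRoughness.m.
wavDir      = fullfile('data', 'wav');        % directory with the WAV recordings
outDir      = fullfile('data', 'roughness');  % where the roughness profiles are stored
doPlots     = false;                          % whether to create plots
useTextGrid = true;                           % whether to use Praat TextGrids
tierName    = 'vowel';                        % tier of interest (made-up name)
targetLabel = '*';                            % '*' matches every label in the tier

% Labels as they might be read from one TextGrid tier (made-up data).
labels = {'a', 'i', 'u', 'a'};

% Select the intervals to analyse: all of them for '*', exact matches otherwise.
if strcmp(targetLabel, '*')
    selected = true(size(labels));
else
    selected = strcmp(labels, targetLabel);
end
```

With targetLabel set to 'a' instead, only the first and last intervals of this made-up tier would be selected.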
This software should work with Matlab version >= 9.1.0.441655 (R2016b).
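If you are unsure which release you are running, a check along the following lines may help. It is only a sketch: it relies on the built-in verLessThan function and on R2016b being release 9.1 of Matlab. The warning identifier is made up, and nothing here is part of the project code.

```matlab
% R2016b corresponds to Matlab release 9.1.
minVersion = '9.1';

% verLessThan returns true when the running Matlab is older than minVersion.
tooOld = verLessThan('matlab', minVersion);

if tooOld
    % 'creak:oldMatlab' is a made-up warning identifier for this sketch.
    warning('creak:oldMatlab', ...
            'Matlab %s found; R2016b (%s) or later is expected.', ...
            version, minVersion);
end
```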
This software is built upon several Matlab libraries included as subdirectories in the project:
- Covarep (for vocalic activity determination)
- Mview (for reading Praat TextGrid files)
- Roughness model implementation by Schrader & Hermes
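Because these libraries live in subdirectories, Matlab has to find them on its search path. Whether creakByRoughness.m already takes care of this is not stated here, so the following is only a sketch of doing it by hand. It assumes the current folder is the project root; projectRoot is a placeholder name.

```matlab
% Folder that contains creakByRoughness.m; pwd is only a stand-in here.
projectRoot = pwd;

% genpath lists projectRoot and its subfolders (it skips special folders
% such as "private" and class or package folders); addpath then puts
% them on the Matlab search path for the current session.
addpath(genpath(projectRoot));
```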
This program receives as arguments a directory where the WAV files are located, a destination folder, a summary file name, a switch for whether to plot the spectrograms and roughness traces, and a choice of segmentation method: automatic vocalic segmentation, Praat TextGrids, or no segmentation.
The program outputs a summary file. This file comprises a table with as many rows as WAV files in the input directory, with file name, token duration, and number of creakiness candidates found in the token.
Roughness temporal profiles are stored in the destination folder as Comma Separated Value (CSV) files with each row comprising:
- Measurement time (beginning of the frame),
- Probability of voicing,
- Duration of the vocalic segment,
- Roughness in aspers, and
- Whether that token is considered creaky or not.
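The five columns listed above can be read back into Matlab for further analysis. The sketch below first writes a small made-up profile to a temporary file so that it runs on its own, then reads it and summarises it. It assumes the CSV files are purely numeric, have no header row, and keep the column order given above; the values, the time unit (seconds), and the variable names are invented for illustration.

```matlab
% Made-up rows in the documented column order:
% time, voicing probability, segment duration, roughness (asper), creaky flag
demo = [0.00 0.95 0.12 0.40 0;
        0.01 0.97 0.12 1.35 1;
        0.02 0.96 0.12 1.10 1;
        0.03 0.90 0.12 0.30 0];

% Stand-in for a profile written by the program.
csvFile = [tempname '.csv'];
dlmwrite(csvFile, demo, 'precision', '%.4f');

% Assumes numeric columns and no header row.
data = dlmread(csvFile, ',');

roughness = data(:, 4);        % roughness in aspers
isCreaky  = data(:, 5) ~= 0;   % creakiness decision as a logical vector

creakyShare   = mean(isCreaky);   % fraction of rows flagged as creaky
peakRoughness = max(roughness);   % largest roughness value in the profile

fprintf('%d of %d rows flagged creaky, peak roughness %.2f asper\n', ...
        sum(isCreaky), numel(isCreaky), peakRoughness);

delete(csvFile);   % remove the temporary file
```

dlmread and dlmwrite are used because they are available in R2016b; on recent releases readmatrix and writematrix are the recommended replacements.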
A Recursive Neural Network implementation is currently under development.
Examples of Burmese words:
Burmese uses four tones (checked, creaky, high, and low), two of which are considered creaky (checked and creaky). In the table below, audio examples of these tones are presented along with spectrograms and roughness traces of their vocalic parts:
Spectrograms and roughness profiles for the vocalic parts of the selected tokens. Solid symbols indicate a creaky frame as determined by the program.
Du’an Zhuang corpus:
129 tokens with TextGrids and creakiness classifications:
Du’an Zhuang Corpus
How to cite:
Julián Villegas, Konstantin Markov, Jeremy Perkins, and Seunghun J. Lee, “Automatic prediction of creaky voice with psychoacoustic roughness,” JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, VOL. XX, NO. X, MONTH 2019 (SUBMITTED).