Buckeye Corpus


SpeechSearcher, a user-friendly tool for searching the corpus and for quickly viewing and listening to tokens from the search results, is now available through the Buckeye Speech Corpus page (i.e., you must log in to obtain it). It has powerful search capabilities, and researchers and course instructors should find it particularly appealing. The manual is provided here.

Python classes for the Buckeye Corpus

This repository has Python classes for working with the Buckeye Corpus. The classes make it easier to iterate through the corpus annotations. They provide cross-references between the .words, .phones, and .log annotations, and can be used to extract sound clips from the .wav files. The docstrings in buckeye.py and containers.py describe how to use the classes in more detail. The repository is available at https://github.com/scjs/buckeye
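To illustrate what extracting a sound clip from a .wav file involves, here is a minimal sketch that uses only Python's standard wave module. It is not the repository's own API (see the docstrings in buckeye.py and containers.py for that). The function names extract_clip and write_clip are hypothetical, and the one-second silent recording is synthetic data built in memory so that the example runs without the corpus.

```python
import io
import wave


def extract_clip(wav_file, beg, end):
    """Read the frames between beg and end (in seconds) from a WAV file.

    Hypothetical helper for illustration; not part of the buckeye package.
    Returns the raw frame bytes and the parameters of the source file.
    """
    with wave.open(wav_file, "rb") as wav:
        rate = wav.getframerate()
        first = int(round(beg * rate))          # index of the first frame
        count = int(round(end * rate)) - first  # number of frames to read
        wav.setpos(first)
        frames = wav.readframes(count)
        params = wav.getparams()
    return frames, params


def write_clip(frames, params, out_file):
    """Write raw frames to a new WAV file with the source's audio format."""
    with wave.open(out_file, "wb") as out:
        out.setnchannels(params.nchannels)
        out.setsampwidth(params.sampwidth)
        out.setframerate(params.framerate)
        out.writeframes(frames)


# Synthetic stand-in for a corpus recording: 1 s of silence,
# mono, 16-bit, 16 kHz (assumed values, chosen for the demo).
source = io.BytesIO()
with wave.open(source, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(16000)
    w.writeframes(b"\x00\x00" * 16000)
source.seek(0)

# Cut out the interval from 0.25 s to 0.75 s and write it to a new file.
frames, params = extract_clip(source, 0.25, 0.75)
clip = io.BytesIO()
write_clip(frames, params, clip)
clip.seek(0)
with wave.open(clip, "rb") as check:
    clip_frames = check.getnframes()
```

In practice the begin and end times would come from an annotation entry, and a file path would take the place of the in-memory buffers.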

Other speech analysis tools

Three other speech analysis tools can be used with the corpus to view the speech files and their corresponding time-aligned label files: Xwaves, Wavesurfer, and Praat. If you use Wavesurfer, you will need the configuration file BuckeyeCorpus.conf. Save it in the Wavesurfer configuration directory (consult the Wavesurfer help if you do not know where that is on your computer), and select it when you are prompted for a configuration on opening a speech file. If the label files are in the same directory as the speech file, they will be loaded automatically. BuckeyeCorpus.conf also contains the entire set of labels.
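For readers who want to inspect a time-aligned label file without one of these tools, the following is a rough sketch of a reader for an Xwaves-style label file. Everything about the layout is an assumption made for illustration: a header terminated by a line containing only "#", then one entry per line consisting of a time, a color code, and a label whose fields are separated by semicolons, with each time marking the end of its interval. The sample text, the Entry type, and the parse_labels function are hypothetical, so check them against the actual corpus files before relying on them.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    beg: float         # start of the interval, in seconds
    end: float         # end of the interval, in seconds
    fields: List[str]  # label fields, split on semicolons


def parse_labels(text):
    """Parse Xwaves-style label text into a list of Entry tuples.

    Assumes the header ends at a line containing only '#', and that each
    time marks the END of its interval, so an interval begins where the
    previous one ended.
    """
    entries = []
    in_body = False
    previous = 0.0
    for line in text.splitlines():
        line = line.strip()
        if not in_body:
            in_body = (line == "#")  # skip header lines up to the marker
            continue
        parts = line.split(None, 2)  # time, color code, label
        if len(parts) < 3:
            continue                 # ignore blank or malformed lines
        end = float(parts[0])
        fields = [field.strip() for field in parts[2].split(";")]
        entries.append(Entry(previous, end, fields))
        previous = end
    return entries


# Hypothetical sample, written for this sketch rather than copied from the corpus.
SAMPLE = """\
signal example
type 0
separator ;
nfields 1
#
    0.000000  121 {B_TRANS}
    0.034000  122 <SIL>
    0.320000  122 okay; ow k ey; k ay; NN
"""

entries = parse_labels(SAMPLE)
for entry in entries:
    print(f"{entry.beg:.3f}-{entry.end:.3f}  {entry.fields[0]}")
```

Storing both the begin and end time on each entry makes it straightforward to look up which label covers a given point in the recording.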

©2005 Department of Psychology