Corpus Collection and Recording
Collection of speech was completed by spring 2000. Forty speakers were recruited from the Columbus, Ohio community,
all natives of Central Ohio (i.e., born in or near Columbus, or moved there no later than age 10). The sample design
is stratified for age (under thirty and over forty) and sex. To attract participants, class was not strictly controlled;
most speakers are middle class to upper working class.
From the 40 speakers, about 300,000 words of speech were collected, from which the corpus of aligned speech will be created.
This large sample should ensure that the estimates of the forms and frequency of phonological variation are representative
of the population under study. Furthermore, there should be a large number of tokens of many variant forms appearing in
different phonetic environments, allowing for the control of phonetic environment in studying variation.
Speakers were recruited using three methods: (1) advertisements in local free newspapers in four neighborhoods of Columbus
and the Ohio State University newspaper; (2) referrals from other speakers; (3) recruitment of friends and neighbors.
Speakers were screened during a short phone call to make sure they were members of the target population. Potential speakers
were told that the research team was interested in how people express their opinions in conversation. Qualified speakers
came to the Ohio State University campus to have a conversation about everyday topics such as politics, sports, traffic,
and schools. Use of this procedure was approved by the Internal Review Board, and no speaker expressed concern after being
debriefed on the true purpose of the study.
After extensive piloting of different protocols for eliciting large amounts of unmonitored speech, a
modified sociolinguistic interview format was chosen. Interviews were conducted in a small seminar room by the (male)
postdoc and (female) graduate assistant (see speech samples). To control for the possible influences of the interviewer's
sex, cells were balanced so that each interviewer met with half of the speakers in each cell.
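
The balanced design described above can be illustrated with a short sketch. It is not the project's actual assignment procedure: it assumes the 40 speakers are split evenly across the four age-by-sex cells (10 per cell, a figure not stated in the text), and the speaker IDs and interviewer labels are hypothetical placeholders.

```python
from itertools import product

AGE_GROUPS = ("under 30", "over 40")
SEXES = ("female", "male")
# Labels are illustrative only; the text identifies a male postdoc
# and a female graduate assistant as the two interviewers.
INTERVIEWERS = ("male postdoc", "female graduate assistant")
# Assumption: 40 speakers divided evenly over the 4 cells.
SPEAKERS_PER_CELL = 10


def build_design(speakers_per_cell=SPEAKERS_PER_CELL):
    """Return one record per speaker, alternating interviewers within each cell."""
    design = []
    speaker_id = 1
    for age, sex in product(AGE_GROUPS, SEXES):
        for i in range(speakers_per_cell):
            design.append({
                "speaker": f"S{speaker_id:02d}",  # hypothetical ID scheme
                "age": age,
                "sex": sex,
                # Alternating gives each interviewer half of every cell.
                "interviewer": INTERVIEWERS[i % len(INTERVIEWERS)],
            })
            speaker_id += 1
    return design


def interviewer_counts(design):
    """Count speakers per (age, sex, interviewer) combination."""
    counts = {}
    for row in design:
        key = (row["age"], row["sex"], row["interviewer"])
        counts[key] = counts.get(key, 0) + 1
    return counts


design = build_design()
counts = interviewer_counts(design)
```

Under these assumptions, every combination of age group, sex, and interviewer receives the same number of speakers, so interviewer sex is not confounded with either stratification variable.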