Wednesday, October 27, 2004, 12:00pm, EB 3105


Speaker: Douglas W. Oard, University of Maryland, College Park


Title: Using Oral History to Learn About Searching Spontaneous Conversational Speech




Recent dramatic improvements in the accuracy of automatic transcription for spontaneous conversational speech hold the promise to unlock access to vast quantities of spoken language sources. Realizing that promise requires that we develop search technology that is tuned to the nature of conversational speech. In this talk, I will describe a first effort to do just that, leveraging a large collection of "oral history" interviews for which a uniquely rich collection of associated metadata is available. I'll briefly describe the status of our work on speech recognition, topic segmentation, and text classification. I'll then focus on the process that we have used to build an information retrieval test collection, and our results from initial experiments with that collection. I'll conclude by explaining how those results are helping to guide future work on speech recognition, and our plans for building test collections for languages other than English. This is joint work with Charles University, the IBM TJ Watson Research Center, the Johns Hopkins University, the Survivors of the Shoah Visual History Foundation, and the University of West Bohemia.


About the speaker:


Douglas Oard is an Associate Professor at the University of Maryland, College Park, with a joint appointment in the College of Information Studies and the Institute for Advanced Computer Studies. He holds a Ph.D. in Electrical Engineering from the University of Maryland, and his research interests center around the use of emerging technologies to support information seeking by end users. Dr. Oard's recent work has focused on cross-language information retrieval, searching spoken language collections, data mining from text, and the exchange of ratings by networked users. Additional information is available at