Using Millions of Weblog Stories as a Knowledge Base
Andrew Gordon
Institute for Creative Technologies
University of Southern California
Friday, April 24
11:00 AM - 12:00 PM
3540 Engineering Building (Please note new location.)
Host: Joyce Chai
The rise of Internet weblogs has opened a new channel for the sharing of personal stories of everyday experiences. Many of the stories in weblogs focus on the mundane: another day at the office, a bicycle trip with a friend, an argument with a spouse, or a weekend spent doing household tasks. However, the remarkable scale of storytelling on the web creates new opportunities for tackling problems that require broad coverage of all facets of everyday life. In this talk, I will describe our research on the collection and analysis of personal stories in Internet weblogs on a massive scale. I'll describe our work in creating corpora of millions of personal stories using statistical text classification techniques. I'll discuss the difficulties in analyzing the text of weblog stories, particularly with respect to the causal and temporal relationships between sentences and clauses. I'll outline a research agenda for using weblog stories as a knowledge base for automated reasoning about everyday events, and present a prototype technology that takes the form of a text-based interactive storytelling application.
Andrew Gordon is a Research Scientist and Research Associate Professor at the University of Southern California's Institute for Creative Technologies. He is the author of the 2004 book Strategy Representation: An Analysis of Planning Knowledge. He received his Ph.D. in Computer Science from Northwestern University in 1999.