Graphs everywhere: Novel methods for summarization and natural language processing
Host: J. Chai
Abstract: Graph-based representations have proven to be a very helpful tool for natural language processing and machine learning. Recent successes include Brin and Page's PageRank method for ranking Web pages and Zhu, Ghahramani, and Lafferty's semi-supervised learning methods using harmonic functions. In this talk, I will present a framework for natural language processing using random walks on graphs. The first part introduces the concept of lexical centrality, based on random walks on lexical similarity graphs. Lexical centrality is used to find the most important passages in a collection of textual documents. The second part will discuss work in progress on semi-supervised learning with binary features, with applications to natural language problems such as parsing. In both cases I will show state-of-the-art results on competitive challenges. I will also show two publicly available demos that illustrate the concepts of the talk.
Biography: Dragomir R. Radev is an Associate Professor of Information, Electrical Engineering and Computer Science, and Linguistics at the
Dr. Radev's current research is supported by NSF and NIH. It covers probabilistic and link-based methods for exploiting very large textual repositories, graph-based methods for natural language processing, representing and acquiring knowledge of genome regulation, and semantic entity and relation extraction from Web-scale text document collections. He serves on the HLT-NAACL advisory committee, was recently reelected as treasurer of NAACL, is a member of the editorial boards of JAIR and Information Retrieval, and is a four-time finalist at the ACM programming finals (as a contestant in 1993 and as a coach in 1995-1997).
He received a graduate teaching award at