MSU CSE Colloquium Series 2013-2014: Martin Herman
Title: Scalable and Interactive Data Analytics

Dr. Chandan Reddy
Department of Computer Science
Wayne State University

Time: March 14th 11:00am
Location: EB 3105


Big data is almost everywhere. In recent years, data acquisition has become easier and data storage has become cheaper, which has led to the accumulation of large volumes of data in a wide range of applications. Analytical methods that can provide critical insights from such voluminous datasets have yet to catch up with these rapid developments. This talk consists of two parts. In the first part, I will present MapReduce-based scalable ensemble machine learning algorithms that handle large-scale data efficiently by letting multiple computing nodes work on it simultaneously. I will demonstrate the superior performance of the proposed algorithms in terms of speedup and scaleup while maintaining the generalizability of the corresponding original versions. Related topics such as privacy-preserving data mining and multi-task learning in the context of big data will also be discussed. In the second part of the talk, I will present a novel interactive topic modeling approach for analyzing document collections. Most of the widely used topic modeling methods based on probabilistic models, such as Latent Dirichlet Allocation (LDA), have drawbacks in terms of consistency across multiple runs, empirical convergence, and the incorporation of user feedback. To overcome these challenges, we developed a reliable and flexible visual analytics system for topic modeling called UTOPIAN (User-driven Topic modeling based on Interactive Nonnegative Matrix Factorization). I will describe the UTOPIAN system and how it enables users to interact with the topic modeling method and steer the results in a user-driven manner. I will end the talk with some of our ongoing work in healthcare and social media.


Chandan Reddy is an Associate Professor in the Department of Computer Science at Wayne State University. He received his Ph.D. from Cornell University and his M.S. from Michigan State University. He is the Director of the Data Mining and Knowledge Discovery (DMKD) Laboratory and a scientific member of the Karmanos Cancer Institute. His primary research interests are data mining and machine learning, with applications to healthcare analytics, bioinformatics, and social network analysis. His research is funded by the National Science Foundation, the National Institutes of Health, the Department of Transportation, and the Susan G. Komen for the Cure Foundation. He has published over 45 peer-reviewed articles in leading conferences and journals, including TPAMI, TKDE, SIGKDD, ICDM, SDM, and CIKM. He received the Best Application Paper Award at the ACM SIGKDD conference in 2010 and was a finalist in the INFORMS Franz Edelman Award Competition in 2011. He is a member of IEEE, ACM, and SIAM.


Dr. Pang-Ning Tan