CSE Colloquium 2012-2013 Shivakumar Vaithyanathan

Declarative Analytics: Applications and Tools

Dr. Shivakumar Vaithyanathan

IBM Chief Scientist for Big Data Analytics

Time: Friday, April 12, 2013, 11:00 a.m.-noon.

Location: 3105 Engineering Building


Abstract: Modern enterprises are performing complex analyses on increasingly large data sets to drive business decisions. Tasks such as root cause analysis from system logs and social media analytics for lead generation, customer retention and digital marketing are rapidly gaining importance. These applications consist of three major analytic phases: text analytics, semi-structured data processing (joins, group-by, aggregation), and statistical/predictive modeling. The size of the data sets, in conjunction with the complexity of the analysis, necessitates large-scale distributed processing of the analytical algorithms. At IBM we are building tools and technologies to support each of these analytic phases, and in particular we are building declarative languages for these phases. While the declarative nature of these languages abstracts away the need for programmer optimization, their syntax is designed to appeal to the corresponding communities. For example, for statistical modeling we expose a high-level language with syntax similar to R, a very popular statistical processing language.
In this talk I will give an overview of some real-world big data applications we are currently working on and use them to motivate the need for the three major phases discussed above. I will then describe, in some detail, declarative systems for text analytics and statistical modeling, along with a discussion of speeds, feeds and comparisons.

Speaker Bio: Shivakumar Vaithyanathan is the IBM Chief Scientist for Big Data Analytics and the Department Manager of the Large Scale Analytics and Discovery Group at the IBM Almaden Research Center. Since joining IBM in 1998, he has been involved in multiple research areas. His department is currently involved in building systems for scalable text analytics, enterprise search and large-scale machine learning. Multiple technologies developed in his department currently ship with several IBM products, including IBM’s Big Data Products. Prior to IBM, Shivakumar was part of the newly formed Altavista Group at Digital. Shivakumar has co-authored more than 40 publications and was an invited keynote speaker at the 2011 German Database Conference and the 2011 ACM SIGIR Industrial Track.


Host: Dr. Anil Jain