Scalability Techniques for An Emerging Breed of Applications using Intermittently Synchronized Databases
College of Computing
Georgia Institute of Technology
Thursday, February 10
Talk: 4-5:00 p.m.; Reception: 5-5:30 p.m.
Room 2243, Engineering Building
Host: Charles Owen
Abstract: In this work we consider an environment where one or more servers carry databases that are of interest to a community of clients. The clients are only intermittently connected to the server, for brief periods of time. Clients carry a part of the database for their own processing and accumulate local updates while disconnected. We call this the "intermittently synchronized database" (ISDB) environment. ISDBs have wide application in sales force automation, insurance claim processing, and a number of applications with mobile work groups. Our focus in the paper is on the problem of update propagation at the server in ISDBs and the associated processing at the clients.
The typical client-based approach involves the communication and processing of updates and transactions on a per-client basis. The complexity of this approach is on the order of the number of connecting clients, which limits the scalability of the server as the number of clients increases. We propose a clustering approach called grouping, which aggregates data fragments into "data-groups" and assigns one or more of these groups to each client. A single data fragment may be assigned to multiple data-groups as well. The proposed scheme results in server processing complexity on the order of the number of groups. Clients receive updates for the groups they belong to and filter out the irrelevant data. In this talk we propose various techniques for aggregation and discuss the processing required at the clients to enable the aggregation approach. The per-client approach is expected to degrade as the number of clients increases. In contrast, we expect that a properly designed aggregation scheme will sustain a number of clients that is an order of magnitude larger. The talk will propose a heuristic approach as well as a mathematical model to deal with the problem of data grouping. We have also implemented a multicasting protocol to go with this approach for improved performance. A number of research issues remain to be investigated in this important segment of future applications. A prototype has been developed on top of a beta version of an industry product. Performance studies are currently in progress.
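The grouping idea in the abstract can be illustrated with a minimal sketch. It is not the speaker's implementation: the function names, the dictionary layout, and the duplicate-dropping step are all assumptions made for illustration. The server bundles pending updates once per data-group, and each client merges the bundles for its groups and discards fragments it does not hold.

```python
from collections import defaultdict


def group_updates(updates, fragment_groups):
    """Server side (hypothetical): bundle pending updates once per data-group.

    updates         -- list of (fragment_id, change) pairs
    fragment_groups -- dict mapping fragment_id to the groups that contain it
    """
    per_group = defaultdict(list)
    for fragment, change in updates:
        # A fragment may belong to several data-groups, so it may be copied
        # into several bundles.
        for group in fragment_groups.get(fragment, ()):
            per_group[group].append((fragment, change))
    return dict(per_group)


def client_view(per_group, client_groups, client_fragments):
    """Client side (hypothetical): keep only the updates this client needs.

    Merges the bundles of the groups the client belongs to, filters out
    fragments the client does not carry, and drops duplicates that arrive
    through more than one group.
    """
    seen = set()
    kept = []
    for group in client_groups:
        for fragment, change in per_group.get(group, []):
            key = (fragment, change)
            if fragment in client_fragments and key not in seen:
                seen.add(key)
                kept.append(key)
    return kept
```

In this sketch the server does one bundling pass per group regardless of how many clients subscribe to it, while the filtering cost moves to the clients.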
Biography: Shamkant Navathe is a professor at the College of Computing, Georgia Institute of Technology, Atlanta. He is well known for his work on database modeling, database conversion, database design, distributed database allocation, and database integration. He has worked with IBM and Siemens in their research divisions and has been a consultant to various companies including Digital, CCA, HP, and Equifax. He was the General Co-chair of the 1996 International VLDB (Very Large Data Base) conference in Bombay, India. He was also Program Co-chair of SIGMOD 1985 and General Co-chair of the IFIP WG 2.6 Data Semantics Workshop in 1995. He is an associate editor of ACM Computing Surveys and IEEE Transactions on Knowledge and Data Engineering. He is also on the editorial boards of Information Systems (Pergamon Press) and Distributed and Parallel Databases (Kluwer Academic Publishers). He is an author of the book Fundamentals of Database Systems, with R. Elmasri (Addison Wesley, Edition 3, 2000), which is currently the leading database textbook worldwide. He also co-authored the book "Conceptual Design: An Entity Relationship Approach" (Addison Wesley, 1992) with Carlo Batini and Stefano Ceri. His current research interests include object-oriented and multimedia databases, human genome data management, intelligent information retrieval, and mobile disconnected databases. Navathe holds a Ph.D. from the University of Michigan and has over 100 refereed publications.