SGER: DataScope: Viewing Database Contents in Multi-Resolution at Your Finger Tips

 

National Science Foundation Award Number: IIS-06-42771 (09/01/2006—08/31/2007) 

 

Contact Information

 

Jiawei Han, PI
Department of Computer Science
University of Illinois, Urbana-Champaign
201 N. Goodwin Ave., Urbana, Illinois 61801, U.S.A.
Office: (217) 333-6903, Fax: (217) 265-6494

E-mail: hanj at cs.uiuc.edu, URL: http://www.cs.uiuc.edu/~hanj

 

List of Supported Students and Staff

 

  • Tianyi Wu and several B.S. students (for implementation), Department of Computer Science, University of Illinois at Urbana-Champaign

Project Award Information

  • Award Number: IIS-06-42771
  • Duration: 09/01/2006—08/31/2007
  • Title: SGER: DataScope: Viewing Database Contents in Multi-Resolution at Your Finger Tips
  • Keywords: database systems, data visualization, data warehouse, human-computer interaction, data mining applications

Project Summary

The goal of this research project is to investigate issues in the design and development of a multi-resolution data viewing system, called DataScope, which interacts with database systems to support visual comprehension and exploration of database contents in a flexible, user-friendly, multidimensional, and multi-resolution way. The approach consists of investigating design principles and implementation considerations, including structures of dimensions, ordering of attribute values, layout of data values, flexible selection of layouts, attributes, and constraints, rich information display, and automated generation and customization of the visualization interface. The research integrates technologies from database systems, data warehousing, data visualization, human-computer interaction, and data mining. It emphasizes efficient implementation of the functions for visual exploration of database contents and the scale-up of such functions for real-time use. The research results are to be published in research forums on database systems, data mining, and human-computer interaction, and to be integrated into graduate education at UIUC. This project is expected to have broad applications in many fields that need to explore huge amounts of data, including business, industry, government agencies, scientific research, and education. The progress of the project and the research results will be disseminated via the project Web site (http://www.cs.uiuc.edu/~hanj/projs/datascope.htm).
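To make the idea of multi-resolution viewing concrete, the following is a minimal sketch of how the same records can be aggregated into cells at a fine or a coarse resolution by rolling up along dimension hierarchies. It is an illustration only, not the DataScope implementation: the toy records, the `VENUE_AREA` hierarchy, and the function names `year_level`, `venue_level`, and `view` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy records: (year, venue, paper_count). Not real DBLP data.
RECORDS = [
    (2004, "VLDB", 3),
    (2006, "VLDB", 5),
    (2006, "SIGMOD", 2),
    (2007, "KDD", 4),
    (2007, "ICDE", 6),
]

# Hypothetical concept hierarchy for the venue dimension (venue -> area).
VENUE_AREA = {
    "VLDB": "databases",
    "SIGMOD": "databases",
    "ICDE": "databases",
    "KDD": "data mining",
}


def year_level(year, resolution):
    """Map a year to the start of its bucket; resolution is the bucket width in years."""
    return year - year % resolution


def venue_level(venue, coarse):
    """Return the venue itself (fine view) or its research area (coarse view)."""
    return VENUE_AREA[venue] if coarse else venue


def view(records, year_resolution=1, coarse_venue=False):
    """Aggregate paper counts into (year bucket, venue or area) cells."""
    cells = defaultdict(int)
    for year, venue, count in records:
        key = (year_level(year, year_resolution), venue_level(venue, coarse_venue))
        cells[key] += count
    return dict(cells)


fine = view(RECORDS)                                          # one cell per (year, venue)
coarse = view(RECORDS, year_resolution=5, coarse_venue=True)  # "zoomed-out" view
```

Zooming out merges cells but preserves totals: the five fine cells collapse into three coarse ones whose counts still sum to the same value, which is the property that lets a viewer switch resolutions without losing data.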

Publications and Products:

1.      Chao Liu, Xiangyu Zhang, Jiawei Han, Yu Zhang and Bharat K. Bhargava, “Failure Indexing: A Dynamic Slicing Based Approach”, in Proc. 2007 IEEE Int. Conf. on Software Maintenance (ICSM'07), Paris, France, Oct. 2007.

2.      Deng Cai, Xiaofei He, and Jiawei Han, “A Unified Subspace Learning Framework for Content-Based Image Retrieval”, in Proc. 2007 Int. Conf. on ACM Multimedia (ACM-MM'07), Augsburg, Germany, Sept. 2007.

3.      Tianyi Wu, Yuguo Chen and Jiawei Han, “Association Mining in Large Databases: A Re-Examination of Its Measures”, in Proc. 2007 Int. Conf. on Principles and Practice of Knowledge Discovery in Databases (PKDD'07), Warsaw, Poland, Sept. 2007.

4.      Chen Chen, Xifeng Yan, Philip S. Yu, Jiawei Han, DongQing Zhang, and Xiaohui Gu, “Towards Graph Containment Search and Indexing”, in Proc. 2007 Int. Conf. on Very Large Data Bases (VLDB'07), Vienna, Austria, Sept. 2007.

5.      Hector Gonzalez, Jiawei Han, Xiaolei Li, Margaret Myslinska, and John Paul Sondag, “Adaptive Fastest Path Computation on a Road Network: A Traffic Mining Approach”, in Proc. 2007 Int. Conf. on Very Large Data Bases (VLDB'07), Vienna, Austria, Sept. 2007.

6.      Xiaolei Li and Jiawei Han, “Mining Approximate Top-K Subspace Anomalies in Multi-Dimensional Time-Series Data”, in Proc. 2007 Int. Conf. on Very Large Data Bases (VLDB'07), Vienna, Austria, Sept. 2007.

7.      Tianyi Wu, Xiaolei Li, Dong Xin, Jiawei Han, Jacob Lee, and Ricardo Redder, “DataScope: Viewing Database Contents in Google Maps' Way”, in Proc. 2007 Int. Conf. on Very Large Data Bases (VLDB'07), Vienna, Austria, Sept. 2007 (system demo).

8.      Xiaoxin Yin, Jiawei Han, and Philip S. Yu, “Truth Discovery with Multiple Conflicting Information Providers on the Web”, in Proc. 2007 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'07), San Jose, CA, Aug. 2007.

9.      Xiaolei Li, Jiawei Han, Jae-Gil Lee, and Hector Gonzalez, “Traffic Density-based Discovery of Hot Routes in Road Networks”, in Proc. 2007 Int. Symp. on Spatial and Temporal Databases (SSTD'07), Boston, MA, July 2007.

10.  Deng Cai, Xiaofei He and Jiawei Han, “Isometric Projection”, in Proc. 2007 AAAI Conf. on Artificial Intelligence (AAAI-07), Vancouver, B.C., Canada, July 2007.

11.  Wen Jin, Anthony K.H. Tung, Martin Ester, and Jiawei Han, “On Efficient Processing of Subspace Skyline Queries on High Dimensional Data”, in Proc. 2007 Int. Conf. on Scientific and Statistical Database Management (SSDBM'07), Banff, Canada, July 2007.

12.  Deng Cai, Xiaofei He, Yuxiao Hu, Jiawei Han, and Thomas Huang, “Learning a Spatially Smooth Subspace for Face Recognition”, in Proc. 2007 IEEE Conf. on Computer Vision and Pattern Recognition (CVPR'07), Minneapolis, MN, June 2007.

13.  Jae-Gil Lee, Jiawei Han, and Kyu-Young Whang, “Trajectory Clustering: A Partition-and-Group Framework”, in Proc. 2007 ACM SIGMOD Int. Conf. on Management of Data (SIGMOD'07), Beijing, China, June 2007.

14.  Dong Xin, Jiawei Han, and Kevin C.-C. Chang, “Progressive and Selective Merge: Computing Top-K with Ad-hoc Ranking Functions”, in Proc. 2007 ACM SIGMOD Int. Conf. on Management of Data (SIGMOD'07), Beijing, China, June 2007.

15.  Feida Zhu, Xifeng Yan, Jiawei Han, and Philip S. Yu, “gPrune: A Constraint Pushing Framework for Graph Pattern Mining”, in Proc. 2007 Pacific-Asia Conf. on Knowledge Discovery and Data Mining (PAKDD'07), Nanjing, China, May 2007. (Best Student Paper Award)

16.  Jiawei Han, Hong Cheng, Dong Xin, and Xifeng Yan, “Frequent Pattern Mining: Current Status and Future Directions”, Data Mining and Knowledge Discovery, 14(1), 2007. (Online version published on January 27, 2007, DOI 10.1007/s10618-006-0059-1 SpringerLink).

17.  Jing Gao, Wei Fan, and Jiawei Han, “A General Framework for Mining Concept-Drifting Data Streams with Skewed Distributions”, in Proc. 2007 SIAM Int. Conf. on Data Mining (SDM'07), Minneapolis, MN, April 2007.

18.  Xiaolei Li, Jiawei Han, Sangkyum Kim, and Hector Gonzalez, “ROAM: Rule- and Motif-Based Anomaly Detection in Massive Moving Object Data Sets”, in Proc. 2007 SIAM Int. Conf. on Data Mining (SDM'07), Minneapolis, MN, April 2007.  (One of “Best of SDM’07”)

19.  Hong Cheng, Xifeng Yan, Jiawei Han, and Chih-Wei Hsu, “Discriminative Frequent Pattern Analysis for Effective Classification”, in Proc. 2007 Int. Conf. on Data Engineering (ICDE'07), Istanbul, Turkey, April 2007.

20.  Feida Zhu, Xifeng Yan, Jiawei Han, Philip S. Yu, and Hong Cheng, “Mining Colossal Frequent Patterns by Core Pattern Fusion”, in Proc. 2007 Int. Conf. on Data Engineering (ICDE'07), Istanbul, Turkey, April 2007. (Best Student Paper Award)

21.  Hector Gonzalez, Jiawei Han, and Xuehua Shen, “Cost-conscious Cleaning of Massive RFID Data Sets”, in Proc. 2007 Int. Conf. on Data Engineering (ICDE'07), Istanbul, Turkey, April 2007.

22.  Xiaoxin Yin, Jiawei Han, and Philip S. Yu, “Object Distinction: Distinguishing Objects with Identical Names by Link Analysis”, in Proc. 2007 Int. Conf. on Data Engineering (ICDE'07), Istanbul, Turkey, April 2007.

23.  Wen Jin, Martin Ester, Zengjian Hu, and Jiawei Han, “The Multi-Relational Skyline Operator”, in Proc. 2007 Int. Conf. on Data Engineering (ICDE'07), Istanbul, Turkey, April 2007.

24.  Deng Cai, Xiaofei He, Kun Zhou, Jiawei Han and Hujun Bao, “Locality Sensitive Discriminant Analysis”, in Proc. 2007 Int. Joint Conf. on Artificial Intelligence (IJCAI'07), Hyderabad, India, Jan. 2007.

25.  Xiaoxin Yin, Jiawei Han, and Philip Yu, “LinkClus: Efficient Clustering via Heterogeneous Semantic Links”, in Proc. 2006 Int. Conf. on Very Large Data Bases (VLDB'06), Seoul, Korea, Sept. 2006.

26.  Dong Xin, Chen Chen, and Jiawei Han, “Towards Robust Indexing for Ranked Queries”, in Proc. 2006 Int. Conf. on Very Large Data Bases (VLDB'06), Seoul, Korea, Sept. 2006.

27.  Dong Xin, Jiawei Han, Hong Cheng, and Xiaolei Li, “Answering Top-k Queries with Multi-Dimensional Selections: The Ranking Cube Approach”, in Proc. 2006 Int. Conf. on Very Large Data Bases (VLDB'06), Seoul, Korea, Sept. 2006.

Project Impact

 

  • Education: Parts of the new research results are used in Data Mining courses (CS512) taught to both undergraduate and graduate students in the Department of Computer Science at the University of Illinois at Urbana-Champaign. Moreover, the research results will be published promptly in international conferences and journals and distributed worldwide for education and research. The new progress will also be integrated into the new edition of my data mining textbook and other research collections.

  • Collaborations: For this project we have established collaborations with IBM T.J. Watson Research Center, Microsoft Research, and NCSA (National Center for Supercomputing Applications). Through these collaborations we expect to gain access to real datasets and applications and to produce more research results.

 

Current and Future Activities

 

We are currently implementing the DataScope system, which will be tested on DBLP and other datasets. The system is also expected to be demonstrated at an international conference and to be made available online so that users can interactively explore DBLP and several other databases.

 

Area References

1.      S. Chaudhuri and U. Dayal, “An Overview of Data Warehousing and OLAP Technology”, SIGMOD Record, 26(1): 65-74, 1997.

2.      J. Gray, S. Chaudhuri, A. Bosworth, A. Layman, D. Reichart, M. Venkatrao, F. Pellow, and H. Pirahesh. “Data cube: A relational aggregation operator generalizing group-by, cross-tab and sub-totals”. Data Mining and Knowledge Discovery, 1:29–54, 1997.

3.      S. Sarawagi, R. Agrawal, and N. Megiddo. “Discovery-driven exploration of OLAP data cubes”. In Proc. Int. Conf. on Extending Database Technology (EDBT’98), pages 168–182, Valencia, Spain, Mar. 1998.

4.      X. Li, J. Han, and H. Gonzalez. “High-dimensional OLAP: A minimal cubing approach”. In Proc. 2004 Int. Conf. Very Large Data Bases (VLDB’04), pages 528–539, Toronto, Canada, Aug. 2004.

5.      U. Fayyad, G. Grinstein, and A. Wierse (eds.). Information Visualization in Data Mining and Knowledge Discovery. Morgan Kaufmann, 2001.

6.      J. Han and M. Kamber. Data Mining: Concepts and Techniques (2nd ed.). Morgan Kaufmann, 2006.

Potential Related Projects

Project Web site URL:  http://www.cs.uiuc.edu/~hanj/projs/datascope.htm

Online software:  Online software related to this project can be downloaded at illimine.cs.uiuc.edu

Online resources:  Research publications related to this project can be downloaded at Selected Publications