CS591HAN Advanced Seminar on Data Mining

4:00-5:00pm every Wednesday, 3403 Siebel Center

About the Course

This is a special-topic course on data mining. The course materials will consist of presentations and discussions of research papers and research project reports closely related to topics in data mining. Students who are taking CS412/CS512 (Data Mining), or who have a good background in data mining, database systems, machine learning, and/or statistics, and who would like to receive one unit of credit are required to register for and attend this course. Other students are welcome to attend. The research papers will be selected from the course supplementary materials (which mainly consist of recently published research papers on data mining and data warehousing).

Prerequisites

  • Students who are taking or have taken CS412 (Introduction to Data Mining), CS512 (Data Mining: Principles and Algorithms), CS446 (Machine Learning), CS411 (Database Systems), or related statistics courses, and who have a good background in data mining, statistics, machine learning, and database systems.

Reference Books

The following texts are recommended for reference. Numerous other books and online resources on data mining are also available.

1.      Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, 2nd ed., Morgan Kaufmann, 2006. See the book's home page for errata, course slides, and other reference materials.

2.      Soumen Chakrabarti, “Mining the Web: Statistical Analysis of Hypertext and Semi-Structured Data”, Morgan Kaufmann, 2002.

3.      R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed., Wiley-Interscience, 2001.

4.      M. H. Dunham, Data Mining: Introductory and Advanced Topics, Prentice Hall, 2002.

5.      U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, Advances in Knowledge Discovery and Data Mining, The MIT Press, 1996.

6.      U. Fayyad, G. Grinstein, and A. Wierse, Information Visualization in Data Mining and Knowledge Discovery, Morgan Kaufmann, 2001.

7.      D. J. Hand, H. Mannila, and P. Smyth, Principles of Data Mining, MIT Press, 2001.

8.      T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer-Verlag, 2001.

9.      T. M. Mitchell, Machine Learning, McGraw Hill, 1997.

10.  Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, Introduction to Data Mining, Addison-Wesley, 2006. ISBN: 0-321-32136-7.

11.  S. M. Weiss and N. Indurkhya, Predictive Data Mining, Morgan Kaufmann, 1998.

12.  I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, 2nd ed., Morgan Kaufmann, 2005. ISBN: 0-12-088407-0.

Major conference proceedings will also be used in class, including ACM SIGKDD (KDD), ACM SIGMOD, VLDB, ICDM, SDM (SIAM Data Mining conference), ICDE, ICML, WWW, and other related conferences.

Course Format and Activities

This course will draw materials from the recent data mining literature. Students will study the materials and complete all the course requirements.

Assignments

Students are required to hand in a half-page abstract for every paper presented in class, right after the class presentation.

Examination

No exam will be given.

Course Project

There is no course project requirement, but students who get good research ideas from the course are encouraged to pursue them further and work toward good research results.

Evaluation

Students who register for this course will be evaluated based on course presentation and participation.


Jiawei Han