Syllabus
 
CLASS INFORMATION
 
CSCI-6967 Database Systems: M-Th 2 PM - 3:20 PM, Sage 2707
Instructor: Sibel Adalı (313 Lally Hall, adalis@rpi.edu)
Office Hours: M 3:30 - 5 PM, W 4 - 6 PM
 
 
INTRODUCTION AND PREREQUISITES
This course will introduce you to information integration, i.e., constructing information services that gather information from multiple, heterogeneous systems. The emphasis in such services is on flexibility and performance: integrated systems must provide important performance and correctness guarantees, yet they must also be easy to build, easy to use, and applicable to a large range of applications. We will concentrate mainly on the theory of integration and on the challenges that must be addressed in dealing with various incompatibilities in the underlying systems. We will also discuss some new applications of information integration and review some industry tools developed for this purpose. This is a graduate-level course in databases and requires previous knowledge of databases equivalent to CSCI 4380 Database Systems. Students are expected to be familiar with the following topics, which are taught in an introductory database class: the relational data model, indexing, and query processing and optimization in databases.
 
TEXTBOOK
There is no textbook for this class. We will cover research papers, most of which will be linked from the course web site. Note that you need to be on campus to load most of the papers. When off campus, you must follow the library links for each research database.
 
Topics to be covered:    
 
  1. Introduction to information integration
  2. Introduction to logic as a database language
  3. Logical integration (LAV, GAV, GLAV)
  4. Logical issues related to integration
  5. Generating mappings
  6. Model management and schema composition
  7. Integration and ontologies
  8. Integration in P2P Systems and Workflows
  9. Applications: scientific applications (esp. bioinformatics), personal information systems, mash-up systems
  10. Existing tools
 
This list is tentative and subject to change as the semester progresses.
 
GRADING
Homeworks: There will be no exams in this course. You will have homeworks roughly every 7-10 days based on the material covered in class. Homeworks will count for 30% of your grade.
 
Presentations: Each student is expected to make two presentations in class: one on a research topic and one on existing industry tools for specific problems. Each presentation will be 30-45 minutes, depending on the topic, and will count for 10% of your grade (20% in total). Each presentation is expected to be an in-depth description of the covered material.
 
Project: Each student is expected to work on a project that either illustrates the use of the methods discussed in class for a specific problem (implementation) or solves a specific new problem (theory). Projects will count for 40% of your grade.
  1. Project description (Feb 14, 2008, 2%),
  2. Project preliminary presentations (March 27, 2008, 2%),
  3. Project final presentations (April 28, 2008, 6%),
  4. Project final report due (Last day of classes, 30%).
 
Class participation: Students are expected to read the class material in advance and participate in class discussions. Class participation will count for 10% of your grade.
 
COURSE POLICIES
You are responsible for all the information posted in this syllabus including the course policies.
 
ACADEMIC HONESTY
You are expected to communicate to the instructor, ahead of time, any issue regarding your performance in class. You should be prepared to provide sufficient proof of the circumstances behind any special request, as outlined in the Rensselaer Handbook of Student Rights and Responsibilities. To document a valid excuse, you can get a letter from the Dean of Students. In that case, you do not have to explain to me the specific circumstances behind your absence or request.
 
Plagiarism, cheating, and academic dishonesty will not be tolerated. All your course work should represent an honest effort to solve the assigned problem by yourself (and with your group partners for group assignments). Make sure you learn how to cite others’ work properly so that you do not risk plagiarism charges.
 
All cases of cheating will be punished and reported to the Dean of Students.
 
ATTENDANCE
You are required to attend the classes and participate in class discussions.
 
LETTER GRADES
I will use the following chart to convert your year-end average to a letter grade.
≥ 90 → A     ≥ 84 → B+    ≥ 74 → C+    ≥ 64 → D+
≥ 87 → A-    ≥ 80 → B     ≥ 70 → C     ≥ 60 → D
             ≥ 77 → B-    ≥ 67 → C-    < 60 → F
I reserve the right to lower these cutoff points, but I will never raise them.
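
As an illustration only, the grading weights and the chart above can be combined as in the following sketch. The component names, the helper functions, and the sample scores are hypothetical. The sketch also assumes that every component is scored on a 0-100 scale, that the project is treated as a single score, and that no rounding is applied; none of this is stated in the syllabus, and the actual cutoffs may be lowered.

```python
# Weights from the GRADING section, in percent of the final grade.
# The key names are illustrative and not part of the syllabus.
WEIGHTS = {
    "homework": 30,
    "presentation_research": 10,
    "presentation_tools": 10,
    "project": 40,
    "participation": 10,
}

# Cutoffs from the LETTER GRADES chart, highest first.
CUTOFFS = [
    (90, "A"), (87, "A-"), (84, "B+"), (80, "B"), (77, "B-"),
    (74, "C+"), (70, "C"), (67, "C-"), (64, "D+"), (60, "D"),
]


def weighted_average(scores):
    """Combine per-component scores (assumed 0-100 each) into a 0-100 average."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(average):
    """Return the first letter whose cutoff the average meets, else F."""
    for cutoff, letter in CUTOFFS:
        if average >= cutoff:
            return letter
    return "F"


# Hypothetical scores for one student.
example_scores = {
    "homework": 85,
    "presentation_research": 90,
    "presentation_tools": 80,
    "project": 88,
    "participation": 100,
}
average = weighted_average(example_scores)
print(round(average, 1), letter_grade(average))
```

The weights sum to 100, so an average computed this way can be compared directly against the cutoffs in the chart.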
 
CHANGES
There may be changes to the policies, deadlines and list of topics described in the syllabus. You can expect me to give you reasonable notice of any changes. All changes will be announced in class.
 
COURSE WEB SITE
The course website will be used to post course policies and course material.
 
CSCI 6967 - Information Integration