The following is the tentative schedule of lectures for this class. It will be updated as the semester progresses.
CSCI 6967 - Information Integration
DATES
1/14/08 - 1/17/08
1/24/08
1/28/08 - 2/4/08
2/7/08 - 2/11/08
2/14/08 - 2/21/08
2/25/08 - 2/28/08
3/3/08
3/6/08
3/11/08 - 3/15/08
3/17/08 - 3/24/08
3/27/08
3/31/08
4/3/08 - 4/14/08
4/17/08 - 4/21/08
4/24/08
4/28/08
TOPIC/PAPERS
Introduction
•Mediators in the architecture of future information systems, Gio Wiederhold IEEE Computer, 1992, Volume: 25 , Issue: 3, page(s): 38 - 49 [IEEE Xplore Link]
•Beauty and the Beast: The Theory and Practice of Information Integration, Laura Haas, Proceedings of ICDT, 2007. [Link]
•From databases to dataspaces: a new abstraction for information management, Michael Franklin, Alon Halevy, David Maier, ACM SIGMOD Record, 2005. [ACM DL Link]
Logic as a database language
•Principles of Database and Knowledge-base systems, Vol II. Jeffrey Ullman, Computer Science Press (ISBN 0-7167-8162-X), Chapters 12, 13
•Lecture notes from Jeff Ullman -> Introduction to Datalog [Link]
Answering queries using views
•Query Processing in the Information Manifold., A Levy, A Rajaraman, J Ordille., Proc. VLDB Conference (1996), [VLDB Link] [Link]
•Navigational plans for data integration., M Friedman, A Levy, T Millstein. Proc. of the 16th Nat. Conf. on Artificial Intelligence (1999), [Link]
•Data integration: a theoretical perspective. Maurizio Lenzerini. Proceedings of the twenty-first ACM SIGMOD Conference (2002), [ACM DL Link]
•Answering queries using views: A survey. Alon Halevy, VLDB Journal, 2001. [Springer Link]
Logical issues related to integration
•Answering queries using templates with binding patterns (extended abstract)., Anand Rajaraman, Yehoshua Sagiv, Jeffrey D. Ullman, Proceedings of the fourteenth ACM SIGMOD Conference (1995), [ACM DL Link]
•Obtaining complete answers from incomplete databases., Alon Levy., Proc. of the 22nd Int. Conf. on Very Large Data Bases (1996), [VLDB Site]
•Tackling inconsistencies in data integration through source preferences., G De Giacomo, D Lembo, M Lenzerini, R Rosati., Proceedings of the 2004 international workshop on Information quality in information systems (2004) [ACM DL Link]
Generating mappings
•Collective entity resolution in relational data, I Bhattacharya, L Getoor., ACM Transactions on Knowledge Discovery from Data (TKDD) (2007), [ACM DL Link]
•Swoosh: A generic approach to entity resolution., O Benjelloun, H García-Molina, Q Su, J Widom., [Link]
•Schema Mapping as Query Discovery., R Miller, L Haas, M Hernández., Proceedings of the 26th International Conference on Very Large Databases (2000), [ACM DL Link]
•Reference reconciliation in complex information spaces., X Dong, A Halevy, J Madhavan., Proceedings of the 2005 ACM SIGMOD international conference (2005), [ACM DL Link]
•Exploiting relationships for object consolidation., Z Chen, D Kalashnikov, S Mehrotra., Proceedings of the 2nd international workshop on Information quality in information systems (2005), [ACM DL Link]
Model management and schema composition
•Nested mappings: schema mapping reloaded., A Fuxman, M Hernández, H Ho, R Miller, P Papotti, Proceedings of the 32nd international conference on Very Large Databases (2006), [ACM DL Link]
•Representing and querying data transformations., Y Velegrakis, R Miller, J Mylopoulos, Proceedings of the International Conference on Data Engineering (2005), [IEEE Xplore Link]
Tools (Student presenters)
Catch up
SPRING BREAK
Query optimization issues
•Using views to generate efficient evaluation plans for queries., Foto N. Afrati, Chen Li, and Jeff Ullman, Journal of Computer and System Sciences (2007) vol. 73 (5) pp. 703-724. [Science Direct Link]
•Adapting to source properties in processing data integration queries. Zachary Ives, Alon Halevy, Daniel Weld., Proceedings of the 2004 ACM SIGMOD international conference on Management of Data [ACM DL link]
•Dynamic Query Scheduling in Data Integration Systems. Luc Bouganim, Françoise Fabret, C. Mohan, and Patrick Valduriez, in Proceedings of the 16th International Conference on Data Engineering, 2000. [IEEE Xplore Link]
•Capturing both types and constraints in data integration. Michael Benedikt, Chee-Yong Chan, Wenfei Fan, Juliana Freire, and Rajeev Rastogi., Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data. [ACM DL link] Presented by Alex Hoch
•Mediators over taxonomy-based information sources. Yannis Tzitzikas, Nicolas Spyratos, Panos Constantopoulos. The VLDB Journal 2005. [Springer Link]
•FICSR: feedback-based inconsistency resolution and query processing on misaligned data sources. Yan Qi, Selcuk Candan and M.L. Sapino, in Proceedings of the 2007 ACM SIGMOD international conference on management of data. [ACM DL Link]
•Leveraging data and structure in ontology integration. Octavian Udrea, Lise Getoor, Renée J. Miller., Proceedings of the 2007 ACM SIGMOD international conference on management of data. [ACM DL link] Presented by Greg Williams
Scientific data and workflows
•Model-Based Mediation with Domain Maps. Bertram Ludäscher, Amarnath Gupta, Maryann E. Martone, in Proceedings of ICDE Conference, 2001. [IEEE Xplore Link]
•Towards a model of provenance and user views in scientific workflows. Shirley Cohen, Sarah Cohen-Boulakia, Susan B. Davidson., In Proceedings of the Workshop on Data Integration in the Life Sciences, 2006. [Link]
Peer to peer data management
•Start making sense: The Chatty Web approach for global semantic agreements. Karl Aberer, Philippe Cudré-Mauroux, and Manfred Hauswirth, Web Semantics: Science, 2003. [Link]
•Mapping data in peer-to-peer systems: semantics and algorithmic issues. Anastasios Kementsietsidis, Marcelo Arenas, and Renee J. Miller, Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data. [ACM DL Link] Presented by Matt Fyffe