Web Database Management Systems
Spring 1999
Thursday 12-1:50 pm
Carnegie 106
" ... Web content producers need tools to
rapidly and inexpensively build huge data stores and sophisticated applications.
This in turn creates a huge demand for database technology that automates
the creation, management, searching, and security of web content. Web consumers
need tools that can discover and analyze information on the Web.... " The
Asilomar Report on Database Research, by Bernstein et al., SIGMOD Record,
December 1998.
Subject:
In this course, we will examine research
on data management for Web-based information, i.e., hyperlinked
documents that are spread across a public domain network. We will read
and discuss research papers related to this area. The subjects
to be covered are (tentatively):
- query languages and systems for Web data, SGML;
- management of Web documents, searching, updating and maintaining Web
content;
- finding structure in Web information;
- data integration over the Web;
- warehousing of Web data;
- data-intensive applications on the Web, E-commerce issues;
- transactions on the Web.
The ultimate objective of the course is to learn
about the issues and current approaches in the area, and to discuss
the unresolved research problems in data management that need to be addressed.
Course Work:
- Homework 1, reports due March 18,
1999 and reviews due April 1, 1999. (20% of the final grade)
- Review Form for reading reports
- Project Part 1, due March 25,
1999. (20% of the final grade)
- Project Part 2, due April 29,
1999. (20% of the final grade)
- In-class presentations (30% of the final grade)
- In-class participation (10% of the final grade)
PAPERS TO BE READ
- Database Techniques for the World-Wide Web: A Survey.
Daniela Florescu, Alon Levy, and Alberto Mendelzon.
- Querying the World Wide Web.
A. Mendelzon, G. Mihaila, and T. Milo. International Journal
on Digital Libraries, 1(1):54-67, 1997.
- Introduction to SGML/XML.
- The LOREL Query Language for Semistructured Data.
Serge Abiteboul, Dallan Quass, Jason McHugh,
Jennifer Widom, Janet Wiener. International Journal on Digital Libraries.
- An Object-Oriented SGML/HyTime Compliant Multimedia Database
Management System. Tamer Ozsu, Paul Iglinski, Duane Szafron,
Sherine El-Medani, and Manuela Junghanns.
Multimedia Systems, 1997.
- Learning to Extract Symbolic Knowledge from the World-Wide Web.
Mark Craven, Dan DiPasquo,
Dayne Freitag, Andrew McCallum, Tom Mitchell, Kamal Nigam, and Sean Slattery.
In Proceedings of the AAAI Fifteenth National Conference on Artificial
Intelligence, 1998.
- Wrapper Induction for Information Extraction.
N. Kushmerick, R. Doorenbos, and D. Weld.
In Proceedings of the 15th International Joint Conference on Artificial
Intelligence, 1997.
Lecture Notes
Presentation Schedule (so far)
- Jan 28, 1999. Tugrul Bingol, Querying the World Wide Web
- Feb 4, 1999. John Avitabile, The LOREL Query Language for
Semistructured Data
- Feb 4, 1999. Michelle Conway, An Object-Oriented SGML/HyTime
Compliant Multimedia Database Management System.
- Feb 11, 1999. Frank McDermoot, WebOQL: Restructuring documents,
databases and webs.
- Feb 11, 1999. Jeremy Winston, To weave the Web.
- Feb 18, 1999. Sibel Adali, Queries and Computation on the Web.
- Feb 18, 1999. Shannon Pixley, Adaptive Web Sites:
an AI Challenge.
- Feb 25, 1999. Anand Paka, A Declarative Language
for Querying and Restructuring the Web.
- Feb 25, 1999. Ekta Agarwal, A Conceptual Model and a Tool
Environment for Developing More Scalable, Dynamic, and Customizable
Web Applications.
- Mar 4, 1999. John Avitabile, Catching the Boat
with Strudel: Experiences with a Web Site Management System.
- Mar 4, 1999. Eric Breime, Using Probabilistic Information in Data
Integration.
- Mar 18, 1999. Chi-nan Chiang, Efficient Crawling Through URL
Ordering.
- Mar 25, 1999. Chris Achille, Wrapper Induction for Information
Extraction.
- Mar 25, 1999. Micheal Bailey, Learning to Extract Symbolic
Knowledge from the World-Wide Web.
- Apr 1, 1999. Shannon Pixley, Optimizing Queries Across Diverse
Data Sources.
- Apr 1, 1999. William Herzog,
Integration of Heterogeneous Databases Without Common Domains Using
Queries Based on Textual Similarity.
- Apr 8, 1999. Sibel Adali, Equal time for data on the Internet with
WebSemantics.
- Apr 8, 1999. Jin Li,
Leveraging Mediator Cost Models with Heterogeneous Data Sources.
- Apr 16, 1999. Tugrul Bingol, Transactional Services for the Web.
- Apr 16, 1999. Jin Li, A Unified Algorithm
for Cache Replacement and Consistency in Web Proxy Servers.
- Apr 22, 1999. Chi-nan Chiang, Web Mining: Information and
Pattern Discovery on the World Wide Web; and Grouping Web Page
References into Transactions for Mining World Wide Web Browsing Patterns.
- Apr 22, 1999. Ekta Agarwal, "Data In Your
Face": Push Technology in Perspective.