Research Seminar, Summer Semester 2004
"New Developments in Bioinformatics and Information Integration"
- Friday, July 9, 2004, 11:15 a.m., RUD 25, Room IV.111 -
Duplicate-Driven Schema Matching
- Alexander Bilke
- Computergestützte Informationssysteme (CIS), TU Berlin
Research Training Group (Graduiertenkolleg) Verteilte Informationssysteme
When building a system that integrates information from disparate data sources, one has to define mappings between the sources and the global view. Schema matching is the process of finding related attributes across these schemas. In this paper we describe a new schema matching approach that uses duplicate tuples, i.e., tuples in different sources that represent the same real-world entity, to establish attribute correspondences. We show that a few duplicates can be discovered without an existing schema alignment, and we describe how correspondences are inferred from these duplicates.
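
The idea of inferring attribute correspondences from duplicates can be illustrated with a minimal Python sketch. This is not the algorithm presented in the talk, only a simplified illustration under stated assumptions: the duplicate pairs are already given, values are compared by exact match after trivial normalization, and correspondences are chosen greedily one-to-one. The names `normalize`, `infer_correspondences`, and `threshold` are hypothetical.

```python
from collections import defaultdict


def normalize(value):
    """Trivial normalization (assumption): compare values case-insensitively."""
    return str(value).strip().lower()


def infer_correspondences(duplicate_pairs, threshold=0.5):
    """Guess attribute correspondences from pairs of duplicate tuples.

    duplicate_pairs: list of (row_a, row_b), where each row is a dict mapping
    attribute name to value and both rows are believed to describe the same
    real-world entity. Returns a dict {attribute_in_a: attribute_in_b}.
    """
    if not duplicate_pairs:
        return {}

    # Count, for every attribute pair, in how many duplicates the values agree.
    agreement = defaultdict(int)
    for row_a, row_b in duplicate_pairs:
        for attr_a, val_a in row_a.items():
            for attr_b, val_b in row_b.items():
                if normalize(val_a) == normalize(val_b):
                    agreement[(attr_a, attr_b)] += 1

    # Greedy one-to-one selection, highest agreement ratio first.
    total = len(duplicate_pairs)
    ranked = sorted(agreement.items(), key=lambda item: (-item[1], item[0]))
    used_a, used_b, result = set(), set(), {}
    for (attr_a, attr_b), count in ranked:
        if count / total < threshold:
            break  # remaining candidates agree too rarely
        if attr_a in used_a or attr_b in used_b:
            continue  # each attribute is matched at most once
        result[attr_a] = attr_b
        used_a.add(attr_a)
        used_b.add(attr_b)
    return result
```

Given a handful of duplicate pairs from two sources whose attribute names differ, the function returns a mapping from the attribute names of the first source to those of the second. A realistic matcher would likely need approximate value comparison and a more careful global assignment than this greedy pass.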