Humboldt-Universität zu Berlin - Mathematisch-Naturwissenschaftliche Fakultät - Wissensmanagement in der Bioinformatik

Research Seminar, Summer Semester 2004

Research Seminar
"New Developments in Bioinformatics and Information Integration"

- Friday, 9 July 2004, 11:15 a.m. RUD 25, Room IV.111 -


Duplicate-Driven Schema Matching

Alexander Bilke
Computergestützte Informationssysteme (CIS), TU Berlin
Graduiertenkolleg Verteilte Informationssysteme

When building a system that integrates information from disparate data sources, one has to define mappings between the sources and the global view. Schema matching is the process of finding attributes in different schemas that correspond to each other. In this paper we describe a new schema matching approach that uses duplicate tuples, i.e. tuples in different sources that describe the same real-world entity, to establish attribute correspondences. We show that a few duplicates can be discovered even without an existing schema alignment, and we describe the process of inferring correspondences from these duplicates.
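
The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the speaker's algorithm: the abstract does not say which similarity measures are used, so the Jaccard overlap for finding duplicates, the `difflib` string similarity for comparing fields, both thresholds, and all function names and sample tables below are assumptions chosen only for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def tuple_similarity(row_a, row_b):
    """Jaccard overlap of the two value sets; attribute names are ignored."""
    values_a = {str(v).lower() for v in row_a.values()}
    values_b = {str(v).lower() for v in row_b.values()}
    if not values_a or not values_b:
        return 0.0
    return len(values_a & values_b) / len(values_a | values_b)


def find_duplicates(table_a, table_b, threshold=0.5):
    """Step 1 (assumed measure): pair up tuples whose values overlap strongly."""
    return [
        (i, j)
        for i, row_a in enumerate(table_a)
        for j, row_b in enumerate(table_b)
        if tuple_similarity(row_a, row_b) >= threshold
    ]


def infer_correspondences(table_a, table_b, duplicates, min_score=0.8):
    """Step 2 (assumed measure): average field similarity over all duplicates
    and keep, for each attribute of table_a, its best partner in table_b."""
    matches = {}
    if not duplicates:
        return matches
    totals = defaultdict(float)
    for i, j in duplicates:
        for attr_a, val_a in table_a[i].items():
            for attr_b, val_b in table_b[j].items():
                ratio = SequenceMatcher(
                    None, str(val_a).lower(), str(val_b).lower()
                ).ratio()
                totals[(attr_a, attr_b)] += ratio
    for attr_a in table_a[0]:
        best_attr, best_score = max(
            ((attr_b, totals[(attr_a, attr_b)] / len(duplicates))
             for attr_b in table_b[0]),
            key=lambda pair: pair[1],
        )
        if best_score >= min_score:
            matches[attr_a] = best_attr
    return matches


# Hypothetical sample data: two sources with different attribute names.
table_a = [
    {"name": "Alice Meyer", "city": "Berlin", "phone": "030-1234"},
    {"name": "Bob Schmidt", "city": "Hamburg", "phone": "040-5678"},
]
table_b = [
    {"person": "Alice Meyer", "tel": "030-1234", "town": "Berlin"},
    {"person": "Carla Weber", "tel": "089-9999", "town": "Munich"},
]

duplicates = find_duplicates(table_a, table_b)
matches = infer_correspondences(table_a, table_b, duplicates)
print(duplicates)  # [(0, 0)]
print(matches)     # {'name': 'person', 'city': 'town', 'phone': 'tel'}
```

The sketch compares every tuple of one source with every tuple of the other, which is only practical for small tables. It also relies on exact value overlap to find duplicates, so real data with spelling variants would need a more tolerant measure.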