LLL'05 Challenge: Genic Interaction Extraction with Alignments and Finite State Automata
Jörg Hakenberg1*, Conrad Plake1, Ulf Leser1, Harald Kirsch2, and Dietrich Rebholz-Schuhmann2
1 Humboldt-Universität zu Berlin, Department of
Computer Science, Knowledge Management Group, Unter den Linden 6, 10099
Berlin, Germany.
2 European Bioinformatics Institute, Rebholz-Group, Hinxton
CB10 1SD, United Kingdom.
* Corresponding author. Current affiliation: Knowledge
Management in Bioinformatics, Dept. Computer Science,
Humboldt-Universität zu Berlin, Rudower Chaussee 25, 12489 Berlin,
Germany. Phone: +49.30.2093.3903, Email:
hakenberg(a)informatik.hu-berlin.de
Abstract
We present a system for identifying syntactic patterns that describe interactions between genes and proteins in scientific text. The system uses two methods: sequence alignments applied to sentences annotated with interactions and syntactic information (part-of-speech tags), and finite state automata optimized with a genetic algorithm. Both methods identify syntactic patterns that generalize textual representations of agent-target relations. We match the generated patterns against arbitrary text to extract interactions and their respective partners. Our best system uses finite state automata optimized with a genetic algorithm and scored an F1-measure of 51.8% on the LLL'05 evaluation set.
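As a rough illustration of the alignment idea in the abstract, the sketch below scores how well a part-of-speech pattern aligns with a tagged sentence using a standard Needleman-Wunsch global alignment. It is not the authors' implementation: the function name `align_score`, the placeholder tags `AGENT` and `TARGET`, the Penn-Treebank-style tags, and the scoring values are all assumptions chosen for the example.

```python
def align_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score between two tag sequences.

    The scoring scheme (match/mismatch/gap) is an illustrative assumption,
    not a value taken from the paper.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty sequence costs one gap per tag.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]


# Hypothetical pattern and tagged sentence; AGENT/TARGET mark the interaction partners.
pattern = ["AGENT", "VBZ", "TARGET"]
sentence = ["AGENT", "VBZ", "DT", "TARGET"]

print(align_score(pattern, pattern))   # identical sequences: three matches
print(align_score(pattern, sentence))  # three matches and one gap for "DT"
```

Under this scoring, a sentence that differs from the pattern only by an extra determiner still aligns well, which is the kind of generalization over surface variation that the abstract describes.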
Published in
Proceedings of the Learning Language in Logic Workshop (LLL05) at the
22nd ICML 2005, pp. 38-45. Bonn, Germany, August 2005.
@InProceedings{Hakenberg:2005c,
  author    = {J\"org Hakenberg and Conrad Plake and Ulf Leser and Harald Kirsch and Dietrich Rebholz-Schuhmann},
  title     = {LLL'05 Challenge: Genic Interaction Extraction with Alignments and Finite State Automata},
  booktitle = {Proc Learning Language in Logic Workshop (LLL05) at the 22nd Int Conf on Machine Learning},
  address   = {Bonn, Germany},
  month     = {August},
  year      = 2005,
  pages     = {38--45}
}