
Humboldt-Universität zu Berlin - Mathematisch-Naturwissenschaftliche Fakultät II - Wissensmanagement in der Bioinformatik

Student research projects and diploma theses

Student research projects and bachelor's, master's, and diploma theses

The chair continuously offers thesis topics in its research areas, in particular in the following fields:

  • Information integration
  • Management and analysis of biomedical data
  • Graph queries
  • Text databases
  • Text mining and information extraction

If you are interested in writing a thesis with us, please contact Prof. Leser.

We are explicitly open to joint theses with companies, research institutes, or other institutions.

Current topics

  • Master's or diploma thesis "Referential DNA database"

    This thesis will investigate the development of a database for human genome data. Although there is plenty of work on human genome databases, this project is unique in that it will store individual human DNA as referential snapshots with respect to a reference human genome.

    This strategy has several advantages (low memory footprint for individual genomes, no need to keep extended data structures for each genome in the database), but also poses several research challenges (implementation of updatable data structures for query answering).

    Disclaimer: a significant part of this thesis is to be spent on theoretical work first (a survey of existing genome databases, an investigation of possible queries to such a database, and design decisions). The thesis has to be written in English, and the implementation should be in either Java or C++.

    If you are interested and would like further information, please get in touch with Sebastian Wandelt.
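To make the idea of a referential snapshot concrete, here is a minimal sketch that stores an individual sequence only as its differences from a reference sequence and reconstructs it on demand. It is an illustration under stated assumptions, not the design the thesis is expected to arrive at: the function names `encode_snapshot` and `decode_snapshot` are hypothetical, the diff is computed with Python's standard `difflib`, which would not scale to real genomes, and Python is used only for brevity (the thesis itself calls for Java or C++).

```python
from difflib import SequenceMatcher


def encode_snapshot(reference, individual):
    """Describe `individual` as a list of edits against `reference`.

    Each edit is (start, end, replacement): the reference slice
    reference[start:end] is replaced by `replacement`. Regions that
    are identical to the reference are not stored at all.
    Hypothetical helper, for illustration only.
    """
    matcher = SequenceMatcher(None, reference, individual, autojunk=False)
    return [
        (i1, i2, individual[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def decode_snapshot(reference, snapshot):
    """Rebuild the individual sequence from the reference and its edits."""
    parts = []
    cursor = 0  # next reference position that has not been copied yet
    for start, end, replacement in snapshot:
        parts.append(reference[cursor:start])  # unchanged stretch
        parts.append(replacement)              # individual-specific bases
        cursor = end
    parts.append(reference[cursor:])           # unchanged tail
    return "".join(parts)
```

A sequence that is identical to the reference yields an empty snapshot, which is where the low memory footprint comes from. Answering queries directly on such snapshots, and keeping them updatable, is the open part that the sketch does not address.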

  • Master's or diploma thesis "DNA read compression with shingles"

    This thesis will investigate the development of a compression scheme for DNA reads. Given a (large) set of reads, the problem is to eliminate as many almost-redundant reads as possible in order to reduce the amount of data. Naive compression can be performed by duplicate elimination. More sophisticated approaches include 1) near-duplicate elimination, i.e. removing reads that have a small edit distance to other reads in the data set, and 2) relative alignment of reads and referential compression. This thesis should focus on the second option.

    Disclaimer: a significant part of this thesis is implementation work. The thesis has to be written in English, and the implementation should be in either Java or C++.

    If you are interested and would like further information, please get in touch with Sebastian Wandelt.
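The two compression steps named in the read compression topic, duplicate elimination followed by referential compression, can be sketched as follows. This is one possible reading, not the method the thesis prescribes: it assumes that a reference sequence is available, that a read is located through its first shingle (k-mer) only, and that only substitutions are tolerated. All function names, the shingle length `k`, and the threshold `max_mismatches` are hypothetical, and Python is used only for brevity (the thesis itself calls for Java or C++).

```python
def build_shingle_index(reference, k):
    """Map every length-k shingle of the reference to its start positions."""
    index = {}
    for pos in range(len(reference) - k + 1):
        index.setdefault(reference[pos:pos + k], []).append(pos)
    return index


def compress_read(read, reference, index, k, max_mismatches=2):
    """Encode a read relative to the reference, or keep it raw.

    Returns ("ref", position, length, mismatches), where mismatches is
    a list of (offset, base) substitutions, or ("raw", read) when no
    candidate position is close enough.
    """
    for pos in index.get(read[:k], []):
        window = reference[pos:pos + len(read)]
        if len(window) < len(read):
            continue  # read would run past the end of the reference
        mismatches = [
            (offset, base)
            for offset, (ref_base, base) in enumerate(zip(window, read))
            if ref_base != base
        ]
        if len(mismatches) <= max_mismatches:
            return ("ref", pos, len(read), mismatches)
    return ("raw", read)


def decompress_read(entry, reference):
    """Invert compress_read."""
    if entry[0] == "raw":
        return entry[1]
    _, pos, length, mismatches = entry
    bases = list(reference[pos:pos + length])
    for offset, base in mismatches:
        bases[offset] = base
    return "".join(bases)


def compress_reads(reads, reference, k=4, max_mismatches=2):
    """Drop exact duplicates, then encode each remaining read."""
    index = build_shingle_index(reference, k)
    unique_reads = list(dict.fromkeys(reads))  # keeps first occurrences in order
    return [
        compress_read(read, reference, index, k, max_mismatches)
        for read in unique_reads
    ]
```

Note that dropping exact duplicates discards how often each read occurred. If that multiplicity matters, a count would have to be stored alongside each entry.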