Research
Research Vision
The Computer Architecture and Communication Group focuses its research on all aspects of dependable systems. Dependability is an umbrella term for several system properties such as availability, reliability, safety, integrity, maintainability and real-time capability. In this context, we work on concepts, methods, algorithms and technologies for analyzing and creating dependable systems in domains ranging from embedded real-time systems to distributed service-oriented systems and cloud computing.
Our research rests on three building blocks of dependable systems. Each of them can be addressed both at design time (offline) and at run time (online). In more detail, the building blocks are the following:
- Data acquisition: Our research usually follows an evidence-based approach to dependable system design and analysis, focussing on data-driven algorithms and analysis techniques. Since data is the typical starting point of our activities, data acquisition strategies are a major building block of our work. Typical research topics are efficient runtime monitoring concepts and variable selection techniques at several architectural levels. One example approach is the interweaving of location information with standard monitoring data in order to achieve better reliability properties for mobile and wireless systems. Another example is the quantitative analysis of complex IT infrastructure management processes, which first requires gathering data through a qualitative assessment.
- Evaluation: System surveillance data needs to be evaluated in order to draw conclusions for dependable system design. In the offline case, we apply statistical methods such as phase-type distributions, or graph theory, in order to determine critical paths in the hardware/software setup. In the online case, evaluation comprises statistical methods such as pattern recognition using hidden semi-Markov models or rank-sum tests.
- Action: Dependability of systems can only be improved if the analyses are followed by appropriate actions. In the offline case, this means, for example, developing improved IT management processes. In the online case, we focus on prediction-triggered preventive maintenance techniques that aim to prevent an upcoming failure or to minimize its impact. This can be achieved, e.g., by restart techniques or optimal runtime reconfiguration.
All three building blocks are accompanied by general techniques for the assessment, modeling and design of systems and system components. Examples include the assessment of service level agreements and service level objectives.
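To make the online evaluation step more concrete, the following is a minimal sketch of a rank-sum test applied to monitoring data. It is an illustration only, not the group's implementation: the function names, the sample values and the 0.01 threshold are assumptions chosen for this example, and the test uses the normal approximation without a tie correction for the variance.

```python
import math
from typing import Sequence


def rank_sum_z(reference: Sequence[float], recent: Sequence[float]) -> float:
    """Wilcoxon rank-sum z-score of `recent` against `reference`.

    Uses the normal approximation; tied values receive average ranks,
    but the variance is not corrected for ties.
    """
    n1, n2 = len(reference), len(recent)
    # Sort all observations together, remembering where each came from.
    combined = sorted((v, i) for i, v in enumerate(list(reference) + list(recent)))
    ranks = [0.0] * (n1 + n2)
    pos = 0
    while pos < len(combined):
        end = pos
        # Extend `end` over a run of equal values.
        while end + 1 < len(combined) and combined[end + 1][0] == combined[pos][0]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[combined[k][1]] = avg_rank
        pos = end + 1
    w = sum(ranks[n1:])  # rank sum of the recent sample
    mean_w = n2 * (n1 + n2 + 1) / 2
    var_w = n1 * n2 * (n1 + n2 + 1) / 12
    return (w - mean_w) / math.sqrt(var_w)


def two_sided_p(z: float) -> float:
    """Two-sided p-value of a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical response times (ms): a baseline window and a recent window.
baseline = [100.0, 102.0, 98.0, 101.0, 99.0, 103.0, 97.0, 100.5]
recent = [120.0, 125.0, 118.0, 130.0, 122.0, 127.0, 119.0, 124.0]

z = rank_sum_z(baseline, recent)
if two_sided_p(z) < 0.01:  # threshold is an assumption for this sketch
    print("recent window differs from baseline; a preventive action could be triggered")
```

A rank-based test is convenient for monitoring data because it makes no assumption about the shape of the underlying distribution; a significant shift between the two windows is the kind of evidence that could feed the action step described above.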
Research Topics
Our research activities in the area of dependable systems cover the following topics:
- Proactive failure avoidance, recovery and maintenance
- Adaptive and self-healing systems
- Dependability assessment of IT infrastructures (SHIP model)
- Dependability of embedded systems
- NOMADS - Network Of Mobile Adaptive Dependable Systems
- Dependability in mobile communication and ad-hoc networks
- Location-based services
- Composability and verification of software systems
- Failure prediction for hardware components and complex IT infrastructures
- Dependable multi-core computing / GPU systems
- Dependability in distributed systems and service-oriented architectures
- Stochastic modelling of dependability in complex systems
- Planning and implementation of service level agreements
Current Research Projects
- Aletheia - Semantic federation of product information. Our group contributes to the system architecture, information retrieval and location (see also the MagicMap project), and information presentation. Please check the Aletheia home page for further information.
  Contact person: Peter Ibach
- SmartKanban: A Kanban system based on intelligent, networked and ultra cost-efficient sensor nodes.
  Contact persons: Peter Ibach, Bratislav Milic
- MagicMap: Cooperative positioning with WLAN and RFID.
  Contact person: Peter Ibach
- METRIK: Graduate school on "Model-based development of technologies for self-organizing decentralized information systems for catastrophe management".
  People involved: Andreas Dittrich
- Proactive Fault Management in the Cloud: Cloud computing provides technology for a new era of dependable computing where redundancy in space can easily be managed dynamically.
  Contact person: Felix Salfner
Current research activities are presented and discussed in the weekly research seminar.
Finished Projects
- InterVal - Internet and Value Chains (BMBF, 2003-2007)
- CERO - CE Robots Community (Microsoft, 2003)
- DISCOURSE - The Berlin Distributed Computing Laboratory
- DFG project "QoS Management in heterogenen Rechnernetzen mit variabler Topologie" (QoS management in heterogeneous computer networks with variable topology)
- DFG project "Failure prediction in critical infrastructures"
- DFG project "Reliability and Performance of Service-Oriented Architectures"
- Graduate school "Stochastische Modellierung und quantitative Analyse großer Systeme in den Ingenieurwissenschaften" (Stochastic modelling and quantitative analysis of large systems in the engineering sciences)
- Motherboard Failure Prediction (Intel, 2006-2008)
Finished Theses
Publications