Project Summary

Research Summary

Humans often manage to learn the behavior of a device or computer program just by pressing buttons and observing what happens. Children in particular are very good at this and know exactly how to use a game console, iPod or microwave oven without ever consulting a manual. In such situations we construct a mental model in the form of a state diagram: we determine which global states a device can be in, and which state transitions and outputs occur in response to which inputs. This research proposal deals with the design of algorithms that allow computers to learn complex state diagrams by providing inputs and observing outputs. The state diagrams that can be learned by current techniques have at most 30,000 states. In contrast, the state diagrams that govern the behavior of computing-based systems (defined using dozens of state variables) typically have more than 10^1000 states. This year we obtained a breakthrough in learning large state diagrams in collaboration with Prof. Jonsson of Uppsala University: based on some global information about how an application handles data, our algorithm learned models of some realistic communication protocols (TCP, SIP and the new biometric passport). The research objective of the ITALIA project is to develop this technique further and to construct a tool set that will allow us to learn, routinely and fully automatically, state diagrams with up to 40 state variables. Our project is unique in bringing together research on automata learning with research on machine learning, model-based testing, and computer-aided verification.
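To make the idea of learning a state diagram from inputs and outputs concrete, the following is a minimal sketch in Python. It is not the ITALIA algorithm: the device (ToyDevice, a lamp with a "power" and a "mode" button), the helper names (query, learn) and the fixed set of distinguishing suffixes are all assumptions made for illustration. The sketch also assumes that the device can be reset before every experiment. Two input sequences are taken to lead to the same state when the device answers every suffix in the same way after both.

```python
class ToyDevice:
    """Hypothetical black box: a lamp with a 'power' and a 'mode' button."""

    def __init__(self):
        self.reset()

    def reset(self):
        self._on = False
        self._bright = False

    def step(self, button):
        if button == "power":
            self._on = not self._on
            self._bright = False
            return "on" if self._on else "off"
        if button == "mode":
            if not self._on:
                return "ignored"
            self._bright = not self._bright
            return "bright" if self._bright else "dim"
        raise ValueError(button)


def query(device, inputs):
    """Reset the device, press the given buttons, and return all outputs."""
    device.reset()
    return tuple(device.step(i) for i in inputs)


def learn(device, alphabet, suffixes):
    """Build a state diagram by breadth-first exploration.

    A state is identified by its 'signature': the outputs the device
    produces for each suffix after a given access sequence.
    """
    def signature(prefix):
        return tuple(query(device, prefix + s)[len(prefix):] for s in suffixes)

    states = {signature(()): ()}      # signature -> shortest access sequence
    transitions = {}                  # (state, input) -> (next state, output)
    frontier = [()]
    while frontier:
        prefix = frontier.pop(0)
        src = signature(prefix)
        for a in alphabet:
            extended = prefix + (a,)
            out = query(device, extended)[-1]
            dst = signature(extended)
            if dst not in states:     # a previously unseen state
                states[dst] = extended
                frontier.append(extended)
            transitions[(states[src], a)] = (states[dst], out)
    return states, transitions


alphabet = ["power", "mode"]
suffixes = [("power",), ("mode",)]    # assumed sufficient to tell states apart
states, transitions = learn(ToyDevice(), alphabet, suffixes)

for (state, button), (nxt, out) in sorted(transitions.items()):
    print(f"{state} --{button}/{out}--> {nxt}")
```

Each learned state is named by the shortest button sequence that reaches it. The sketch only works because the suffix set is chosen by hand and the device is tiny; the number of queries grows with the number of states, which is why diagrams defined by dozens of state variables are out of reach for this kind of naive exploration.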

Utilisation Summary

Once they have high-level models of the behavior of software components, software engineers can construct better software in less time: behavioral models can be used to simulate a system and reason about it, they allow all stakeholders to participate in the development process and to communicate with each other, they can be used to generate and test implementations, and they facilitate reuse. A key problem in practice, however, is the construction of models for existing software components for which no or only limited documentation is available. The solution that the ITALIA project will provide is technology that allows engineers to infer state diagram models fully automatically through observation and testing, that is, through black-box reverse engineering. We expect that our technology will be particularly effective for control-oriented applications such as embedded controllers and network protocols. The ITALIA project will focus on the utilisation of model inference technology within the area of testing: once we have learned a model of a software component, we will use model checking technology to analyze this model (e.g. to detect security vulnerabilities) and the technology of model-based testing to automatically generate test suites. Using these test suites we can then check, for instance:

  • whether no new faults have been introduced in a modified version of the component (regression testing),
  • whether an alternative implementation by some other vendor agrees with a reference implementation,
  • or whether some new implementation of legacy software is correct.
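As an illustration of the second check, the sketch below runs the same input sequences against a reference implementation and a candidate implementation and reports the first sequence on which their outputs differ. Everything in it is hypothetical: the component (a counter with "tick" and "reset" inputs), the names make_counter, run and first_disagreement, and the choice of test suite. Exhaustively enumerating all input sequences up to a fixed length stands in here for a test suite derived from a learned model, which would normally be much smaller.

```python
from itertools import product


def make_counter(modulus):
    """Hypothetical component: counts 'tick' inputs modulo `modulus`."""
    def new_instance():
        count = 0

        def step(inp):
            nonlocal count
            if inp == "tick":
                count = (count + 1) % modulus
                return count
            if inp == "reset":
                count = 0
                return 0
            raise ValueError(inp)

        return step
    return new_instance


def run(new_instance, inputs):
    """Start a fresh instance and return its outputs for the input sequence."""
    step = new_instance()
    return [step(i) for i in inputs]


def first_disagreement(reference, candidate, alphabet, max_len):
    """Return the first input sequence on which the outputs differ, or None."""
    for n in range(1, max_len + 1):
        for seq in product(alphabet, repeat=n):
            if run(reference, seq) != run(candidate, seq):
                return seq
    return None


alphabet = ["tick", "reset"]
reference = make_counter(3)
same_behavior = make_counter(3)   # e.g. a reimplementation by another vendor
faulty = make_counter(4)          # wraps around one tick too late

print(first_disagreement(reference, same_behavior, alphabet, 4))  # None
print(first_disagreement(reference, faulty, alphabet, 4))         # ('tick', 'tick', 'tick')
```

A result of None only means that no difference was found up to the chosen length; it is not a proof that the two implementations are equivalent.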
The development of our inference/learning technology will be driven by challenging case studies from a number of areas:

  • embedded systems (Axini), 
  • secure transaction systems (Collis),
  • internet protocols (NLnet Labs), 
  • printers (Océ-Technologies),
  • and wireless sensor networks (Chess).
We will fully integrate our technology with the Axini TestManager, a commercial model-based testing tool, and evaluate its effectiveness by comparing it with the commercial testing platforms of Axini and Collis. Our goal is to reach the point where it becomes interesting for commercial parties, including Axini and Collis, to integrate our technology within their testing tools.