# From Traces To (Formal) Models

William Durand - February 13, 2014

• PhD student at Michelin / LIMOS
• Graduated from IUT and ISIMA
• I ♥ Open Source

## PhD Topic

Automated Test Generation for applications and production machines in a Model-based Testing approach.

## Agenda

• Introduction
• Model-based Testing
• Current Research
• Conclusion

## Software Testing

Software testing is the process of analyzing a software item to detect the differences between existing and required conditions (that is, bugs) and to evaluate the features of the software item.

It is a Verification and Validation process.

• Validation → "Are we building the right software?"
• Verification → "Are we building the software right?"

## Why?

• To find faults (G. Myers, The Art of Software Testing)
• To provide confidence of reliability, correctness, and absence of particular faults

This does not mean that the software is completely free of defects. Rather, it must be good enough for its intended use.

## Industry

Unit Testing, Integration Testing, Functional Testing, System Testing, Stress Testing, Performance Testing, Usability Testing, Acceptance Testing, Regression Testing, Beta Testing, <Whatever You Want> Testing

People now understand the need for testing things

They mostly do testing by hand...

One approach is to...

## Definition

Model-based Testing (MbT) is the application of Model-based design to designing, and optionally also executing, artifacts to perform software testing.

Models can be used to represent the desired behavior of an SUT, or to represent testing strategies and a test environment.

## Why?

• To bring the benefits of automation to new parts of the test cycle (test cases creation for instance)
• To provide testers with more effective tools
• To reduce cost and cycle time
• To leverage formal methods

## Models

A Model is a description of a system that helps you understand and predict its behavior. It does not need to completely describe it to be effective.

• Behavior/Control oriented: Finite Automata (FSM, LTS), Petri Nets, Synchronous Languages (Lustre, Scade)
• Data oriented (pre/post): JML, Spec#, OCL, B-Method, Praspel

## Three Stages

1. Formally modelling the requirements (specification);
2. Generating test cases from the model;
3. Running these test cases against an actual SUT and evaluating the results.

Combining 2. and 3. leads to On-The-Fly Testing.

## Test Generation

• Based on Finite State Machines
• Based on Symbolic Transition Systems
• Based on Labelled Transition Systems

## Labelled Transition Systems

$L = (S, Act, \rightarrow)$ with $S = \lbrace s_{1}, s_{2}, s_{3}, s_{4} \rbrace$ and $Act = \lbrace COFFEE, TEA, BUTTON \rbrace$

## Traces

\begin{align}
traces(s_{3}) = & \lbrace \\
& BUTTON, \\
& BUTTON \cdot TEA \cdot BUTTON, \\
& \dots \rbrace = traces(s_{1}) = traces(s_{4}) \\
\\
traces(s_{2}) = & \lbrace \\
& TEA, \\
& COFFEE, \\
& TEA \cdot BUTTON \cdot TEA, \\
& \dots \rbrace
\end{align}
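The trace sets above can be reproduced with a small sketch. The transition relation below is an assumption: the slide's diagram is not part of the text, so the arrows were chosen only to be consistent with the listed traces. The function name `traces` and the `max_len` bound are also illustrative, since the real trace sets are infinite.

```python
# Assumed transition relation (source state, action) -> target states.
# It is NOT taken from the slides; it merely agrees with the traces shown.
TRANSITIONS = {
    ("s1", "BUTTON"): {"s2"},
    ("s2", "TEA"): {"s3"},
    ("s2", "COFFEE"): {"s4"},
    ("s3", "BUTTON"): {"s2"},
    ("s4", "BUTTON"): {"s2"},
}


def traces(state, max_len):
    """Return every non-empty trace of length <= max_len starting in `state`."""
    result = set()
    frontier = {((), state)}  # pairs of (trace so far, current state)
    for _ in range(max_len):
        next_frontier = set()
        for prefix, current in frontier:
            for (source, action), targets in TRANSITIONS.items():
                if source != current:
                    continue
                for target in targets:
                    trace = prefix + (action,)
                    result.add(trace)
                    next_frontier.add((trace, target))
        frontier = next_frontier
    return result


if __name__ == "__main__":
    for s in ("s1", "s2", "s3", "s4"):
        print(s, sorted(traces(s, 3)))
```

Under this assumed relation, the bounded trace sets of `s1`, `s3`, and `s4` coincide, as the slide states.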

## Input/Output LTS

\begin{align} Act_{I} = & \lbrace BUTTON?, COFFEE?, TEA? \rbrace \\ Act_{U} = & \lbrace COFFEE!, TEA! \rbrace \end{align}

## Automated Generation Of Specification Models

• By leveraging the API documentation
• By instrumenting the code (tracing)
• By leveraging the logs
• By monitoring the system

## Challenge

Given software running in a production environment, would it be possible to extract a knowledge base that can be formalized by a model, which can in turn be used to generate tests and/or specifications?

## Context (1/2)

Michelin relies on a method close to the Computer Integrated Manufacturing (CIM) approach to control its production:

• L3: Factory Management (a mostly virtual level, as it is rarely used)
• L2: Supervision / Workshop Management
• L1: Automata

These levels can exchange data with one another.

## Context (2/2)

The focus is on Level 2 applications, but even these differ from one another in many ways, such as:

• Programming Language
• Framework
• Design
• Version

## Hypotheses

• Applications deployed in production behave as expected
• Don't consider (existing) specifications

## Traces

A sequence of observable actions produced by an application.

Raw traces are collected from various sources, including the production environment thanks to a monitor. Traces in the context of web applications are HTTP requests and responses.

## Domain Expert

A (human) domain expert can deduce the meaning of an application execution by reading these traces.

What about doing the same, programmatically?

## Expert System

In Artificial Intelligence, an expert system is a computer system that emulates the decision-making ability of a human expert. It is designed to solve complex problems by reasoning about knowledge, rather than by following a procedure written by a developer, as is the case in conventional programming. Drools Expert, the JBoss rule engine written in Java, is a powerful expert system.

## Rules

First-Order Predicate Calculus

IF condition
THEN
action
ENDIF

## Layer 1 - Filtering

Cleans up the trace set given as input by removing noise, i.e. irrelevant traces. The resulting structured trace set is given to the next layer. In the context of web applications, HTTP requests/responses related to assets (CSS files, JavaScript files, images) are meaningless.
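As an illustration of this filtering step, here is a minimal sketch that drops asset requests based on the URL path. The trace representation (a dict with `method` and `url` keys), the extension list, and the function names are assumptions made for the example. The actual tool expresses filtering as expert-system rules.

```python
from urllib.parse import urlparse

# Hypothetical list of extensions treated as assets (noise).
ASSET_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".svg", ".ico")


def is_asset(url):
    """True when the URL path ends with a known asset extension."""
    path = urlparse(url).path.lower()
    return path.endswith(ASSET_EXTENSIONS)


def filter_traces(raw_traces):
    """Keep only the traces that do not target an asset."""
    return [trace for trace in raw_traces if not is_asset(trace["url"])]


if __name__ == "__main__":
    raw = [
        {"method": "GET", "url": "https://github.com/"},
        {"method": "GET", "url": "https://github.com/assets/app.css?v=2"},
        {"method": "POST", "url": "https://github.com/session"},
    ]
    for trace in filter_traces(raw):
        print(trace["method"], trace["url"])
```

Parsing the URL first means that a query string such as `?v=2` does not hide the extension.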

## Layer 2 - IOSTS Transformation

Based on the previous structured trace set, this layer performs an IOSTS transformation by translating the valued actions of a trace into IOSTS transitions. This first model is regenerated each time new traces are received, and is then passed to the next layer.
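One way to picture this translation is the sketch below, which turns each trace into a path of transitions starting from a shared initial location. Everything here is an assumption made for illustration: the `Transition` fields, the location names (`l0`, `l1`, ...), the dict-based valued actions, and the choice to give each trace its own branch.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Transition:
    source: str   # source location
    action: str   # action label, e.g. "GET"
    guard: tuple  # sorted (variable, value) pairs observed in the trace
    target: str   # target location


def to_transitions(traces):
    """Translate traces (lists of valued actions) into a list of transitions."""
    fresh = count(1)  # generator of fresh location indices
    transitions = []
    for trace in traces:
        current = "l0"  # every trace starts in the initial location
        for valued_action in trace:
            target = f"l{next(fresh)}"
            guard = tuple(sorted(
                (name, value)
                for name, value in valued_action.items()
                if name != "method"
            ))
            transitions.append(
                Transition(current, valued_action["method"], guard, target)
            )
            current = target
    return transitions


if __name__ == "__main__":
    trace = [
        {"method": "GET", "url": "https://github.com/"},
        {"method": "POST", "url": "https://github.com/session"},
    ]
    for transition in to_transitions([trace]):
        print(transition)
```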

## Layer 2 - Minimisation

The previous IOSTS is reduced in terms of its number of locations by applying a bisimulation minimisation technique.

GET("https://github.com/")
POST("https://github.com/session")
GET("https://github.com/")
GET("https://github.com/willdurand")
GET("https://github.com/willdurand/Geocoder")
POST("https://github.com/logout")
GET("https://github.com/")
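The idea behind bisimulation minimisation can be sketched with a naive partition refinement over a plain labelled transition system. This is an illustrative simplification, not the tool's algorithm: guards and assignments are ignored, labels are plain strings, and the function names are hypothetical.

```python
def bisimulation_classes(states, transitions):
    """Map each state to the id of its bisimulation class.

    `transitions` is a set of (source, label, target) triples.
    """
    block = {state: 0 for state in states}  # start with a single block
    while True:
        # A state's signature is its current block plus the set of
        # (label, block of target) pairs it can reach in one step.
        signature = {
            state: (
                block[state],
                frozenset(
                    (label, block[target])
                    for (source, label, target) in transitions
                    if source == state
                ),
            )
            for state in states
        }
        ids = {}
        new_block = {}
        for state in sorted(states):
            new_block[state] = ids.setdefault(signature[state], len(ids))
        # Blocks are only ever split, so an unchanged count means stability.
        if len(ids) == len(set(block.values())):
            return new_block
        block = new_block


def minimise(states, transitions):
    """Return the quotient transitions, with one location per class."""
    block = bisimulation_classes(states, transitions)
    return {(block[s], label, block[t]) for (s, label, t) in transitions}


if __name__ == "__main__":
    states = {"l0", "l1", "l2", "l3", "l4"}
    transitions = {
        ("l0", "GET /", "l1"),
        ("l1", "POST /logout", "l2"),
        ("l0", "GET /willdurand", "l3"),
        ("l3", "POST /logout", "l4"),
    }
    print(sorted(minimise(states, transitions)))
```

In this small example, the two branches end with the same action, so their last two locations are merged pairwise.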

## Layer 3 to N - Abstraction

Composed of rules that emulate the ability of a human expert to simplify transitions, to analyze transition syntax in order to deduce more meaningful information related to the targeted application, and to construct more abstract models. Each layer takes the IOSTS produced by the layer directly below it, which represents the current base of facts. Layer 3 contains low-level, generic rules that can be reused against different applications.

## Layer 3 - Example

rule "Identify Login Page"
when
    $t: Transition(
        Action == GET,
        Guard.response.content contains('login-form')
    )
then
    modify ($t) {
        Assign.add("isLoginPage := true")
    }
end
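For readers unfamiliar with Drools, the rule above can be approximated in plain Python as a condition/action pair applied to a set of facts. This is only a rough analogue. The `Transition` and `Rule` classes, their field names, and `apply_rules` are hypothetical, and the loop makes a single pass where a real rule engine would re-evaluate rules after each modification.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Transition:
    action: str            # HTTP method, e.g. "GET"
    response_content: str  # body of the HTTP response
    assignments: List[str] = field(default_factory=list)


@dataclass
class Rule:
    name: str
    condition: Callable[[Transition], bool]  # the "when" part
    action: Callable[[Transition], None]     # the "then" part


def apply_rules(rules, facts):
    """Fire every rule whose condition holds, once per fact."""
    for fact in facts:
        for rule in rules:
            if rule.condition(fact):
                rule.action(fact)
    return facts


identify_login_page = Rule(
    name="Identify Login Page",
    condition=lambda t: t.action == "GET" and "login-form" in t.response_content,
    action=lambda t: t.assignments.append("isLoginPage := true"),
)

if __name__ == "__main__":
    facts = [
        Transition("GET", '<form id="login-form"></form>'),
        Transition("POST", "OK"),
    ]
    for fact in apply_rules([identify_login_page], facts):
        print(fact.action, fact.assignments)
```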

## The Explorer

The more traces there are, the more expressive the model is. A robot explorer is used to increase the number of traces that are sent to the model generator (i.e. the component that embeds the Expert System). It can only be applied to event-driven applications.

## Strategies

Our robot explorer uses intelligent crawling guided by strategies, rather than blind exploration. Strategies are defined by rules that define how to find new states to visit. This is an orthogonal layer. BFS and DFS are common strategies to explore a web application.
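To make the BFS/DFS distinction concrete, here is a minimal sketch of an explorer whose only difference between strategies is which end of the frontier it takes the next page from. The `links` callable stands in for fetching a page and extracting its links. It and the `explore` signature are assumptions for the example, and the real explorer's strategies are defined by rules.

```python
from collections import deque


def explore(start, links, strategy="bfs", limit=100):
    """Visit pages reachable from `start` and return them in visit order.

    `links` maps a URL to the URLs found on that page.
    """
    frontier = deque([start])
    seen = {start}
    visited = []
    while frontier and len(visited) < limit:
        # BFS takes the oldest discovered page, DFS the most recent one.
        url = frontier.popleft() if strategy == "bfs" else frontier.pop()
        visited.append(url)
        for target in links(url):
            if target not in seen:
                seen.add(target)
                frontier.append(target)
    return visited


if __name__ == "__main__":
    # Tiny in-memory site used instead of real HTTP requests.
    site = {"/": ["/a", "/b"], "/a": ["/a/1"], "/b": [], "/a/1": []}
    print(explore("/", lambda url: site.get(url, []), strategy="bfs"))
    print(explore("/", lambda url: site.get(url, []), strategy="dfs"))
```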

## Automatic Funktional Testing Tool

Written in Java, PHP, Node.js, and JavaScript.

Distributed system thanks to RabbitMQ.

Service Oriented Architecture FTW!

This tool has been built for web applications. Michelin will get its own internal tool.

## So, What?

The final model gathers useful information that makes it possible to:

• generate test cases;
• detect potential issues;
• perform security testing.

## Conclusion

Once rules are written, it is quite easy to construct models. However, rules tend to be hard to write, and to maintain.