\documentclass{presentation}
\usepackage[latin1]{inputenc}
\usepackage{hyperref}
\usepackage{dm-math}

\newcommand{\mycol}[1]{{\bf\color{blue}#1}}
\newenvironment{mycite}[1]{\textcolor{DarkRed}{[#1]}}

\newcommand{\astree}{Astr\'ee}

\title{The {\astree} static analyzer and beyond}

\begin{document}
\maketitle

\intercalaire{Introduction}

\pfoilhead{The {\astree} static analyzer}

{\astree} \url{http://www.astree.ens.fr} is a static analyzer based on
abstract interpretation.
\begin{mylist}
\item Analyzes a subset of the C language.
\item Machine integers and floating-point numbers, not
  ``mathematical'' integers and real numbers.
\item Tuned for large-scale control/command codes, automatically
  generated from high-level specifications.
\item Precise domains for \stress{numerical computations}.
\item Detects \stress{runtime errors}.
\end{mylist}

\pfoilhead{Challenges}

Has to analyze the \stress{original source code}, not a derived ``model''.

Has to be \stress{sound} (i.e. not say ``there is no runtime error possible'' when there are)

Has to be \stress{precise} (i.e. not warn about many \emph{possible} alarms that can't happen --- \stress{false alarms})

Handle floating-point well, including digital filtering algorithms

\pfoilhead{The biggest challenge}\vspace{-1cm}

\begin{center}
\includegraphics[height=7cm]{A380_dsc04393.jpg}
\end{center}

Very large software ($\gg$ 300,000 LOC) $\Rightarrow$ efficiency questions~!\\
\stress{Commodity PC hardware} $\Rightarrow$
keep \stress{memory requirements} low\\
$\Rightarrow$ keep \stress{analysis times} low

\pfoilhead{Efficiency considerations}

\heading{False ``good idea''} For final ``certification'' of the system, only need a single pass of analysis, even if it is slow.

\heading{In reality...} You want fast analysis
\begin{mylist}
\item for debugging the analyzer
\item for using it while you develop the analyzed code
\item for debugging input specifications (i.e. bounds on the inputs)
\end{mylist}

\pfoilhead{The team}

Bruno Blanchet [CNRS] (formerly)\\
Patrick [ENS] and Radhia Cousot [CNRS]\\
Jer\^ome Feret [ENS]\\
Laurent Mauborgne [ENS]\\
Antoine Min\'e [ENS]\\
David Monniaux [CNRS]\\
Xavier Rival [ENS] (now postdoc at Berkeley)

\intercalaire{Abstract interpretation}

\pfoilhead{Basic idea}

To each program point, set of reachable memory (variable) states at this point

Over-approximate the set by some constraints

\begin{center}
\includegraphics[height=6cm]{octagon}
\end{center}

\pfoilhead{Abstract operations}

A program statement has semantics
$\semantics{$P$}: \parts{X} \rightarrow^\cup \parts{X}$ ($X$ = possible memory states).

$\gamma(x)$: $x$ abstraction (ex: shape of polyhedron), $\gamma(x)$ set of concrete memory states represented

Abstraction of a program statement:
$\abstr{\semantics{$P$}}: \abstr{X} \rightarrow \abstr{X}$

Soundness:
\stress{
$\semantics{$P$} \circ \gamma(\abstr{x}) \subseteq
\gamma \circ \abstr{\semantics{$P$}} (\abstr{x})$}

\pfoilhead{Tests}

$\semantics{$B$}$ = set of states matched by boolean expression $B$

Abstract $W \mapsto W \cap \semantics{$B$}$

Can be constructed from tests for atomic expressions and $\sqcup$

$\gamma(\abstr{a}) \cup \gamma(\abstr{b})
\sqsubseteq \gamma(a \sqcup b)$

Ex: polyhedra $\Rightarrow$ convex hull

$\semantics{if $c$ then $p_1$ else $p_2$} (x) =
\semantics{$p_1$} (x \cap \semantics{$c$}) \cup
\semantics{$p_2$} (x \cap \semantics{$c$}^C)$\\
$\abstr{\semantics{if $c$ then $p_1$ else $p_2$}} (\abstr{x}) =
\abstr{\semantics{$p_1$}} \circ \abstr{\semantics{$c$}} (\abstr{x}) \sqcup
\abstr{\semantics{$p_2$}} \circ \abstr{\semantics{$\neg c$}} (\abstr{x})$

\pfoilhead{System of equations approach}

Express the set of states at each statement as a function of the set of states at other statements.

\begin{verbatim}
  beep
A:
  foo
  bar
  if (xyz) goto B;
C:goto A;
\end{verbatim}

States at point \texttt{A}: outcome from \texttt{beep} $\cup$ states at point
\texttt{C}

\pfoilhead{What is the concrete fixed point?}

Sets of reachable states at program points:
\begin{eqnarray}
S_1 = F_1(S_1, S_2, \dots, S_n)\\
\vdots\\
S_n = F_n(S_1, S_2, \dots, S_n)\\
\end{eqnarray}
All $F_i$ functions $\omega$-continuous

Start at $\bot$ and iterate $\omega$ times to reach fixed point.

\pfoilhead{Abstract fixed point}

Replace all $F_i$ by $\abstr{F_i}$.

Accelerate convergence using a widening operator $\triangledown$

Concrete sequence $x_1,x_2,x_3\dots$ approximated by
$\abstr{x_1},\abstr{x_2},\abstr{x_3}\dots$ with
$\forall i~ x_i \subseteq \gamma(\abstr{x_i})$.

\pfoilhead{In practice}

Iterations use the block-structure of the source code\\
Saves memory: one abstract environment per nested loop\\
Trace partitioning

\intercalaire{An open system}

\pfoilhead{Simple I/O}
System takes scalar values (integers, floats) as inputs\\
Simple interval constraints known on them: input cannot go beyond [-10, 10]

Prove "worst case" behaviour

\pfoilhead{Physical feedback}

Industrial users would like user-level properties such as "in normal conditions,
FOOBAR\_X is in [-30, 30]"

At present Astr\'ee does worst case behavior, say [-1000, 1000]

Reconstructed scenarios of inputs are often physically impossible

Future work:
How about an abstract model of the physical world? (Dynamic system, real numbers)

\pfoilhead{Intelligent I/O}

Previous analyses: system does I/O through simple memory-mapped registers, aka \texttt{volatile} variables

Increasing use of intelligent I/O controllers, coprocessors executing I/O programs (industrial req.)

Example: USB controller, OHCI spec, asynchronously updates linked lists (for worklists) and does bus-mastering DMA

Only vague specification for I/O controller (OHCI spec = book in English, many details somewhat vague) $\Rightarrow$
model as highly nondeterministic routines

\pfoilhead{Asynchronous model}

One main big synchronous process\\
Some known memory-mapped I/O zones (if necessary through preanalysis)\\
Some small nondeterministic routines processed asynchronously (e.g. "perhaps process one item from the worklist")

Semantics: before each time when the main process reads/writes MMIO or DMA zones, asynchronous routines may execute an unknown number of times

Each async routine implements some atomic step of the I/O controller (e.g. "transmit one word of data", "update head worklist pointer", etc.)

\pfoilhead{Asynchronous analysis}

Pointer analysis detects interference with MMIO/DMA zones\\
Read/writes to these results in fixpoint iterations of async routines

Work in progress

Difficulty: I/O drivers are often very dirty

Comparison: Microsoft SLAM analyzes drivers, but apparently with ideal integers, does not handle many constructs, and totally ignores I/O controllers

\end{document}