\documentclass{presentation}
\usepackage[latin1]{inputenc}
\usepackage{hyperref}
\usepackage{dm-math}
\newcommand{\mycol}[1]{{\bf\color{blue}#1}}
\newenvironment{mycite}[1]{\textcolor{DarkRed}{[#1]}}
\newcommand{\astree}{Astr\'ee}
\title{The {\astree} static analyzer and beyond}
\begin{document}
\maketitle
\intercalaire{Introduction}
\pfoilhead{The {\astree} static analyzer}
{\astree} \url{http://www.astree.ens.fr} is a static analyzer based on
abstract interpretation.
\begin{mylist}
\item Analyzes a subset of the C language.
\item Machine integers and floating-point numbers, not
``mathematical'' integers and real numbers.
\item Tuned for large-scale control/command codes, automatically
generated from high-level specifications.
\item Precise domains for \stress{numerical computations}.
\item Detects \stress{runtime errors}.
\end{mylist}
\pfoilhead{Challenges}
Has to analyze the \stress{original source code}, not a derived ``model''.
Has to be \stress{sound} (i.e. not say ``there is no runtime error possible'' when there are)
Has to be \stress{precise} (i.e. not warn about many \emph{possible} alarms that can't happen --- \stress{false alarms})
Handle floating-point well, including digital filtering algorithms
\pfoilhead{The biggest challenge}\vspace{-1cm}
\begin{center}
\includegraphics[height=7cm]{A380_dsc04393.jpg}
\end{center}
Very large software ($\gg$ 300,000 LOC) $\Rightarrow$ efficiency questions~!\\
\stress{Commodity PC hardware} $\Rightarrow$
keep \stress{memory requirements} low\\
$\Rightarrow$ keep \stress{analysis times} low
\pfoilhead{Efficiency considerations}
\heading{False ``good idea''} For final ``certification'' of the system, only need a single pass of analysis, even if it is slow.
\heading{In reality...} You want fast analysis
\begin{mylist}
\item for debugging the analyzer
\item for using it while you develop the analyzed code
\item for debugging input specifications (i.e. bounds on the inputs)
\end{mylist}
\pfoilhead{The team}
Bruno Blanchet [CNRS] (formerly)\\
Patrick [ENS] and Radhia Cousot [CNRS]\\
Jer\^ome Feret [ENS]\\
Laurent Mauborgne [ENS]\\
Antoine Min\'e [ENS]\\
David Monniaux [CNRS]\\
Xavier Rival [ENS] (now postdoc at Berkeley)
\intercalaire{Abstract interpretation}
\pfoilhead{Basic idea}
To each program point, set of reachable memory (variable) states at this point
Over-approximate the set by some constraints
\begin{center}
\includegraphics[height=6cm]{octagon}
\end{center}
\pfoilhead{Abstract operations}
A program statement has semantics
$\semantics{$P$}: \parts{X} \rightarrow^\cup \parts{X}$ ($X$ = possible memory states).
$\gamma(x)$: $x$ abstraction (ex: shape of polyhedron), $\gamma(x)$ set of concrete memory states represented
Abstraction of a program statement:
$\abstr{\semantics{$P$}}: \abstr{X} \rightarrow \abstr{X}$
Soundness:
\stress{
$\semantics{$P$} \circ \gamma(\abstr{x}) \subseteq
\gamma \circ \abstr{\semantics{$P$}} (\abstr{x})$}
\pfoilhead{Tests}
$\semantics{$B$}$ = set of states matched by boolean expression $B$
Abstract $W \mapsto W \cap \semantics{$B$}$
Can be constructed from tests for atomic expressions and $\sqcup$
$\gamma(\abstr{a}) \cup \gamma(\abstr{b})
\sqsubseteq \gamma(a \sqcup b)$
Ex: polyhedra $\Rightarrow$ convex hull
$\semantics{if $c$ then $p_1$ else $p_2$} (x) =
\semantics{$p_1$} (x \cap \semantics{$c$}) \cup
\semantics{$p_2$} (x \cap \semantics{$c$}^C)$\\
$\abstr{\semantics{if $c$ then $p_1$ else $p_2$}} (\abstr{x}) =
\abstr{\semantics{$p_1$}} \circ \abstr{\semantics{$c$}} (\abstr{x}) \sqcup
\abstr{\semantics{$p_2$}} \circ \abstr{\semantics{$\neg c$}} (\abstr{x})$
\pfoilhead{System of equations approach}
Express the set of states at each statement as a function of the set of states at other statements.
\begin{verbatim}
beep
A:
foo
bar
if (xyz) goto B;
C:goto A;
\end{verbatim}
States at point \texttt{A}: outcome from \texttt{beep} $\cup$ states at point
\texttt{C}
\pfoilhead{What is the concrete fixed point?}
Sets of reachable states at program points:
\begin{eqnarray}
S_1 = F_1(S_1, S_2, \dots, S_n)\\
\vdots\\
S_n = F_n(S_1, S_2, \dots, S_n)\\
\end{eqnarray}
All $F_i$ functions $\omega$-continuous
Start at $\bot$ and iterate $\omega$ times to reach fixed point.
\pfoilhead{Abstract fixed point}
Replace all $F_i$ by $\abstr{F_i}$.
Accelerate convergence using a widening operator $\triangledown$
Concrete sequence $x_1,x_2,x_3\dots$ approximated by
$\abstr{x_1},\abstr{x_2},\abstr{x_3}\dots$ with
$\forall i~ x_i \subseteq \gamma(\abstr{x_i})$.
\pfoilhead{In practice}
Iterations use the block-structure of the source code\\
Saves memory: one abstract environment per nested loop\\
Trace partitioning
\intercalaire{An open system}
\pfoilhead{Simple I/O}
System takes scalar values (integers, floats) as inputs\\
Simple interval constraints known on them: input cannot go beyond [-10, 10]
Prove "worst case" behaviour
\pfoilhead{Physical feedback}
Industrial users would like user-level properties such as "in normal conditions,
FOOBAR\_X is in [-30, 30]"
At present Astr\'ee does worst case behavior, say [-1000, 1000]
Reconstructed scenarios of inputs are often physically impossible
Future work:
How about an abstract model of the physical world? (Dynamic system, real numbers)
\pfoilhead{Intelligent I/O}
Previous analyses: system does I/O through simple memory-mapped registers, aka \texttt{volatile} variables
Increasing use of intelligent I/O controllers, coprocessors executing I/O programs (industrial req.)
Example: USB controller, OHCI spec, asynchronously updates linked lists (for worklists) and does bus-mastering DMA
Only vague specification for I/O controller (OHCI spec = book in English, many details somewhat vague) $\Rightarrow$
model as highly nondeterministic routines
\pfoilhead{Asynchronous model}
One main big synchronous process\\
Some known memory-mapped I/O zones (if necessary through preanalysis)\\
Some small nondeterministic routines processed asynchronously (e.g. "perhaps process one item from the worklist")
Semantics: before each time when the main process reads/writes MMIO or DMA zones, asynchronous routines may execute an unknown number of times
Each async routine implements some atomic step of the I/O controller (e.g. "transmit one word of data", "update head worklist pointer", etc.)
\pfoilhead{Asynchronous analysis}
Pointer analysis detects interference with MMIO/DMA zones\\
Read/writes to these results in fixpoint iterations of async routines
Work in progress
Difficulty: I/O drivers are often very dirty
Comparison: Microsoft SLAM analyzes drivers, but apparently with ideal integers, does not handle many constructs, and totally ignores I/O controllers
\end{document}