Advisors: Laurent Mounier and Marie-Laure Potet
Context:
Detection and analysis of software vulnerabilities is an active research area in terms of computer security. Its purpose is to find programming flaws that could be exploited by an external user. Although the discovery of 0-day vulnerability is still mainly achieved through human expertise, the design and implementation of tools able to assist experts in their tasks is a major challenge in this area.
One of the techniques frequently used within such tools is the so-called concolic execution,which consists in generalizing some "concrete" program executions by considering some input variables as "symbolic". Instead of being instantiated by concrete values, these variables are represented by symbolic expressions that can be evaluated by a solver. Thus, this technique allows to handle sets of program executions sharing a same control flow. However, current analysis techniques are usually based on coarse-grain program coverage criteria, which are not suitable to vulnerability detection. A typical example is buffer overflows (BoF), requiring extensive loop activations to reveal exploitable behaviors.
Work objective:
The objective of this work is to study several research directions allowing to generalize a given concolic execution to a set of concrete executions corresponding to several loop iterations (i.e., loop summarization), in order to satisfy some exploitability condition. One of the expected outcome is a methodology to be evaluated on classical buffer overflow examples.
Existing approaches for loop summarization consist in synthesizing loop invariants (generalizing the path condition associated with a loop body to the set of all loop iterations). Several techniques have been proposed so far to produce such invariants, for instance:
- The synthesis of linear relations between induction variables used within the loop, possibly extended with symbolic iteration counters. Such relations can be obtained either using abstract interpretations (as in [7]), or by producing extra constraints during a concolic execution (as in [2]).
- Iterative bounded loop unrolling, in order to produce interpolants, namely logical formula expressing sufficient conditions to guarantee the unsatisfiability of given constraints (corresponding for instance to undesired behaviors). This approach is proposed by [5,6] in the framework of program verification.
Possible road map, and expected results:
- Evaluate some existing loop summarization techniques in the particular context of exploitable BoF vulnerability detection, using typical benchmarks ;
- Propose a resulting solution that could be automated using existing (source-level) program analysis platforms, such as Frama-C [3] or LLVM [4].
- Extend this approach at the binary level, and/or to target other vulnerabilities like Use-after-Free [8].
This work will take place within the French ANR BinSec project [1], dedicated to the binary code analysis techniques for software security (vulnerability analysis, malware detection), and gathering several academic and industrial partners.It will take place in the Verimag/DCS team, having a large experience in security and code analysis.
This research internship is well suited for a student interested by vulnerability analysis techniques, compilation, and logic. It can be followed by a PhD on the same topic.
References:
[1] http://binsec.gforge.inria.fr/
[2] Automatic partial loop summarization in dynamic test generation. Patrice Godefroid, Daniel Luchaup. ISSTA 2011
[4] llvm.org
[5] Lazy annotation for program testing and verification. Kenneth L. McMillan. CAV 2010
[6] Unbounded symbolic execution for program verification. Joxan Jaffar, Jorge A. Navas and Andrew E. Santosa. RV 2011
[7] Loop-Extended Symbolic Execution on Binary Programs. Prateek Saxena, Pongsin Poosankam, Stephen McCamant, and Dawn Song. ISSTA 2009
[8] Memory errors: the past, the present, and the future. Victor van der Veen, Nitish dutt-Sharma, Lorenzo Cavallaro and Herbert Bos. RAID’12