Auditorium (Building IMAG)
13 July 2021 - 15h00
Optimized and formally-verified compilation for a VLIW processor (Phd Defense)
by Cyril Six from Kalray & Verimag
Abstract: Software programs are used for many critical roles. A bug in those can
have a devastating cost, possibly leading to the loss of human lives.
Such bugs are usually found at a source level (which can be ruled out
with source-level verification methods), but they can also be inserted
by the compiler unknowingly. CompCert is the first compiler with a
formal proof of correctness: compiled programs are proven to behave the
same as their source programs. However, because of the challenges
involved in proving compiler optimizations, CompCert only has a limited
number of them. As such, CompCert usually generates low-performance code
compared to classical compilers such as GCC. While this may not
significantly impact out-of-order architectures such as x86, on in-order
architectures, particularly on VLIW processors, the slowness is
significant (code running half as fast as GCC -O2). On VLIW processors,
the intra-level parallelism is explicit and specified in the assembly
code through “bundles†of instructions: the compiler must bundlize
instructions to achieve good performance.
In this thesis, we identify, investigate, implement and formally verify
several classical optimizations missing in CompCert. We start by
introducing a formal model for VLIW bundles execution on an interlocked
core and generate those bundles through a postpass (after register
allocation) scheduling. Then, we introduce a prepass (before register
allocation) superblock scheduling, implementing static branch prediction
and tail-duplication along the way. Finally, we further increase the
performance of our generated code by implementing loop unrolling, loop
rotation and loop peeling – the latter being used for Loop-Invariant
Code Motion. These transformations are verified by translation
validation, some of them with hash-consing to achieve reasonable
compilation time. We evaluate each introduced optimization on
benchmarks, including Polybench and TACleBench, on the KV3 VLIW core,
ARM Cortex A53, and RiscV “Rocket†core. Thanks to this work, our
version of CompCert is now only 16% slower (respectively 12% slower and
30% slower) than GCC -O2 on the KV3 (respectively ARM and RiscV),
instead of 50% (respectively 38% and 45%).
Keywords: Embedded, Optimization, Formal Verification, VLIW, Compilation
Jury:
Xavier Leroy, rapporteur, Collège de France
Adam Chlipala, rapporteur, MIT
Laure Gonnord, Université Lyon 1
Delphine Demange, Université Rennes 1
Alain Girault, INRIA
David Monniaux, directeur, CNRS
The defense will also be viewable by videoconference.