Seminar Room, ground floor (Building IMAG)
31 May 2022 - 09h30
Safe Implementation of Hard Real-Time Applications on Many-Core Platforms (PhD Defense)
by Matheus Schuh from Verimag / Kalray
Abstract: Hard real-time systems must be functionally correct and must also
guarantee timing constraints. Completing the task at hand within a given
deadline is part of the specification and failing to accomplish this can lead to
serious consequences. Some examples of such systems are the central command of
an airplane, electronic control units inside a car and a power plant monitoring
room.
Multi/many-core architectures tend to be used in such systems. On top of having
multiple cores that can run programs concurrently, these SoCs may contain
several cache levels, local and global shared memories, buses or
interconnections for communication. Moreover, the cores themselves have dynamic
components in their pipeline such as branch prediction or instruction
reordering. All these features are extremely useful to boost the average
performance, but raise huge problems for hard real-time systems, as they
introduce timing unpredictability.
The predictability of these systems is directly connected to being able to
compute the worst-case response time of an application on a given architecture.
Additionally, on multi/many-core architectures, when numerous cores access
shared hardware resources at the same time, they interfere with each other,
causing mutual slowdowns. To guarantee that the timing constraints of a
real-time system are met, the interference sources must be identified and taken
into account.
The use of multi/many-core architectures in safety-critical real-time systems is
increasing in industry and is an active topic of research. This
thesis provides solutions and comparative studies on several problems raised by
the implementation of critical applications on such platforms. We focus on
providing an integrated approach in order to confidently use multi/many-core
architectures for hard real-time systems. This integrated workflow covers the
choice of an execution model, a strategy to map and schedule tasks and a
hardware model to provide safe bounds on the response time.
The critical application to be analyzed is represented in the form of a Directed
Acyclic Graph (DAG), with precedence constraints between nodes and explicit
communication. This application can be derived from synchronous data flow
languages or any other language or model-based development environment providing
a DAG at the end of the compilation process. Most of our case studies come from
the SCADE tool, some of them being industrial case studies.
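
To make the DAG representation concrete, here is a minimal sketch, not the analysis developed in the thesis. It assumes each node carries a worst-case execution time and each edge an explicit communication cost, and it computes the critical-path length, which is only a lower bound on the response time because it ignores the number of cores and any interference. The function name, node names and all numeric values are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path_length(wcet, edges):
    """Return the longest path through a task DAG.

    wcet:  dict mapping node -> execution time (hypothetical units)
    edges: dict mapping (src, dst) -> communication cost on that edge
    """
    # Collect, for every node, its predecessors and the cost of each edge.
    preds = {node: {} for node in wcet}
    for (src, dst), cost in edges.items():
        preds[dst][src] = cost

    # Visit nodes so that every predecessor is handled before its successors.
    order = TopologicalSorter({n: set(p) for n, p in preds.items()}).static_order()

    finish = {}
    for node in order:
        # A node may start once every predecessor has finished and its
        # data has been communicated.
        start = max((finish[p] + c for p, c in preds[node].items()), default=0)
        finish[node] = start + wcet[node]
    return max(finish.values())


# Hypothetical four-node application.
wcet = {"sense": 2, "filter": 3, "control": 4, "actuate": 1}
edges = {
    ("sense", "filter"): 1,
    ("sense", "control"): 1,
    ("filter", "actuate"): 2,
    ("control", "actuate"): 1,
}
print(critical_path_length(wcet, edges))  # prints 9
```

A safe bound for a real platform would additionally have to account for the mapping of tasks to cores and for the interference delays discussed above.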
The target architecture, the Kalray MPPA3, is a COTS processor, but one with
characteristics that make it a good candidate for real-time systems.
At the core level, it has an in-order pipeline and private caches with a
predictable replacement policy. At the cluster level, it provides low-latency
scratchpad memory and predictable arbitration policies. At the SoC level, it
provides an AXI bus linking the different clusters with constant traversal time.
We present and explore several execution models that help to provide predictable
execution on multi-core platforms. They are applied to the many-core processors
Kalray MPPA2 and MPPA3 and compared to identify the best approach in terms of
phased task execution and memory access restrictions. We show that, in our
context, isolating concurrently executed tasks is too expensive in terms of
response time. The best execution model corresponds to a development process
in which the interference analysis between task memory accesses is integrated
into the implementation step.
An additional improvement that has a significant impact on the overall response
time of a program is the task mapping and scheduling on a given platform. With a
high number of cores and clusters, it has become essential to find a good
placement and ordering of tasks; otherwise the available parallelism risks
being underutilized. We therefore propose an initial multi-step approach that
takes into account local memory use, communication cost and clustering. We
show that our memory-use criterion is a good candidate for use in future work.
The Kalray MPPA3 is the main target architecture of this work and for its use in
hard real-time systems, the arbitration points and shared resource access delay
have been analyzed. A hardware model of intra- and inter-cluster memory accesses
is developed, combined with a response time analysis framework.
Throughout this thesis, several extensions and improvements were made to
different industrial and academic tools: SCADE code generator, multi-core
interference analysis and a high-level hardware model.
Keywords: Many-Core, Synchronous Data-Flow Languages, Critical Systems, Interference analysis, Worst-case execution time analysis
Jury:
Claire Pagetti, Rapporteure, ONERA
Eduardo Tovar, Rapporteur, ISEP-IPP
Isabelle Puaut, Examinatrice, Université de Rennes
Joël Goossens, Examinateur, Université Libre de Bruxelles
Frédéric Pétrot, Examinateur, Grenoble INP
Claire Maiza, Directrice, Grenoble INP
Pascal Raymond, Co-encadrant, CNRS
Benoît Dinechin, Co-encadrant, Kalray
The thesis defense will also be streamed via a Zoom meeting:
https://grenoble-inp.zoom.us/j/99584183331
Meeting ID: 995 8418 3331
Passcode: 225011