Stream processing is at the heart of many applications, such as network traffic monitoring, data analysis or bioinformatics. In each of these domains, basic computation blocks are easy to define, but composing them efficiently remains a complex task when addressing workloads with terabytes of data. One way to handle these computation-intensive applications is to exploit the parallelism inherent in the algorithms, which can be done through vectorial programs, i.e., programs whose instructions are applied to vector data structures instead of single elements.
To execute vectorial programs, we need vectorial machines. Such machines can be built on Single Instruction, Multiple Data (SIMD) architectures, which apply the same instruction to multiple data elements at once. Vectorial extensions have been proposed for several families of standard CPUs (SSE/AVX for Intel x86, SVE for Arm, ...). Several projects have shown that vectorial programs enable efficient computations, not only in domains that operate on “regular data”, but also in stream processing.
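To make the contrast between scalar and vectorial execution concrete, here is a minimal sketch in Python. It only models the idea and does not use real SIMD hardware. The lane count `LANES` and the function names are assumptions chosen for illustration and do not correspond to any particular instruction set.

```python
from typing import List

LANES = 4  # assumed number of elements per vector register (illustrative only)


def scalar_add(xs: List[int], ys: List[int]) -> List[int]:
    """Scalar style: one addition per loop iteration."""
    out = []
    for x, y in zip(xs, ys):
        out.append(x + y)
    return out


def vector_add_instr(va: List[int], vb: List[int]) -> List[int]:
    """Model of ONE vector instruction: the same '+' is applied to every lane."""
    assert len(va) == len(vb) <= LANES
    return [a + b for a, b in zip(va, vb)]


def vector_add(xs: List[int], ys: List[int]) -> List[int]:
    """Vectorial style: consume the inputs LANES elements at a time.

    Assumes xs and ys have the same length; the last chunk may be shorter.
    """
    out = []
    for i in range(0, len(xs), LANES):
        out.extend(vector_add_instr(xs[i:i + LANES], ys[i:i + LANES]))
    return out
```

Both functions compute the same result. The difference is that `vector_add` issues one modelled instruction per group of `LANES` elements rather than one per element.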
However, stream processing solutions are mainly written by hand (using C with intrinsics) and strongly depend on the chosen backend. The approach still lacks a proper formalisation of the expressivity of vectorial backends, which would make it possible to compare them and/or prove the correctness of future compilation schemes.
The operational semantics of languages provides us with theoretical tools to reason about such backends. This internship is part of a national research project, which aims at developing a programming language suited to writing stream processing applications, with a dedicated compiler that generates efficient assembly code for vectorial machines. One of the first steps of the work would be to capture the expressivity of such vectorial machines (Intel AVX, and the more recent free proposal, RISC-V “V”).
This internship is hence a preliminary step towards a RISC-V “V” backend: we want to establish that the existing vectorial extension for RISC-V includes every instruction needed to compile stream processing applications.
The goal of this internship is to set up a framework to assess whether the RISC-V “V” extension is suitable for the compilation of stream processing applications.
It is thus expected that the applicant will:
- study the existing literature to define (or reuse) an operational semantics of the vectorial extension of the RISC-V ISA;
- implement several examples of stream processing applications using this semantics, to demonstrate the suitability of the extension;
- if needed, propose extensions of the ISA (in the operational semantics) to fit the needs of stream processing.
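As an illustration of what the tasks above could look like in practice, here is a minimal sketch of a small-step interpreter for a toy vector machine, written in Python. It is not the RISC-V “V” semantics: the opcode names (`setvl`, `vload`, `vadd`, `vstore`, `add`, `sub`, `bnez`), the constant `VLMAX`, the rule `vl = min(avl, VLMAX)` and the flat memory model are all simplifying assumptions made for this example. The sample program adds two streams element-wise, processing at most `VLMAX` elements per loop iteration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

VLMAX = 4  # assumed maximum number of elements per vector register


@dataclass
class State:
    pc: int = 0                                             # program counter
    vl: int = 0                                             # current vector length
    x: Dict[str, int] = field(default_factory=dict)         # scalar registers
    v: Dict[str, List[int]] = field(default_factory=dict)   # vector registers
    mem: List[int] = field(default_factory=list)            # flat memory


def step(prog, s: State) -> None:
    """One transition: execute the instruction at s.pc, updating s in place."""
    op, *args = prog[s.pc]
    s.pc += 1
    if op == "setvl":                      # vl := min(x[avl], VLMAX); x[rd] := vl
        rd, avl = args
        s.vl = min(s.x[avl], VLMAX)
        s.x[rd] = s.vl
    elif op == "vload":                    # v[vd] := vl elements read at x[base]
        vd, base = args
        s.v[vd] = s.mem[s.x[base]:s.x[base] + s.vl]
    elif op == "vadd":                     # lane-wise addition
        vd, va, vb = args
        s.v[vd] = [a + b for a, b in zip(s.v[va], s.v[vb])]
    elif op == "vstore":                   # write vl elements at x[base]
        vs, base = args
        s.mem[s.x[base]:s.x[base] + s.vl] = s.v[vs]
    elif op == "add":                      # x[rd] := x[r1] + x[r2]
        rd, r1, r2 = args
        s.x[rd] = s.x[r1] + s.x[r2]
    elif op == "sub":                      # x[rd] := x[r1] - x[r2]
        rd, r1, r2 = args
        s.x[rd] = s.x[r1] - s.x[r2]
    elif op == "bnez":                     # branch to target if x[rs] != 0
        rs, target = args
        if s.x[rs] != 0:
            s.pc = target
    else:
        raise ValueError(f"unknown opcode: {op}")


# Toy program: c[i] = a[i] + b[i] over n elements, at most VLMAX per iteration.
STREAM_ADD = [
    ("setvl", "t", "n"),
    ("vload", "v0", "a"),
    ("vload", "v1", "b"),
    ("vadd", "v2", "v0", "v1"),
    ("vstore", "v2", "c"),
    ("add", "a", "a", "t"),
    ("add", "b", "b", "t"),
    ("add", "c", "c", "t"),
    ("sub", "n", "n", "t"),
    ("bnez", "n", 0),
]


def run_stream_add(a: List[int], b: List[int]) -> List[int]:
    """Lay out a, b and an output buffer in memory, then run until pc leaves the program."""
    n = len(a)
    s = State(mem=a + b + [0] * n, x={"n": n, "a": 0, "b": n, "c": 2 * n})
    while s.pc < len(STREAM_ADD):
        step(STREAM_ADD, s)
    return s.mem[2 * n:]
```

For example, `run_stream_add([1, 2, 3, 4, 5, 6], [10, 20, 30, 40, 50, 60])` runs the loop twice (4 elements, then 2) and returns `[11, 22, 33, 44, 55, 66]`. An operation that a stream application needs but the toy instruction set lacks would show up here as a missing case in `step`, which is the kind of gap the third task is meant to identify.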
For more information, please see the proposal attached. If you are interested in this internship, do not hesitate to contact the advisors.