
Foundations for BPF device integration: DSL/IR/compiler part #107

Open
daphne-eu opened this issue Aug 24, 2021 · 2 comments
@daphne-eu (Owner)

In GitLab by @pdamme on Aug 24, 2021, 14:41

We want to support IO kernels for BPF devices (computational storage). To support asynchronous IO in this context, we need to carry information about an open file between kernel calls, so explicit open/close operations are required. For now, this is a kind of sideways access; later, the compiler should support it automatically. In particular, the workflow looks as follows:

For POSIX:

  • open(filename) -> File
  • readCsv(File) -> Matrix
  • close(File) -> void

For BPF:

  • open(dev) -> Target
  • open(Target, filename) -> Descriptor
  • readCsv(Descriptor) -> Matrix
  • close(Target) -> void

The device in the open call is essentially a string.
There will be different kinds of descriptors, depending on the underlying hardware and techniques. In that sense, File could be considered a variant of a Descriptor, or at least the two could share a common base class, so that an IO kernel (e.g., readCsv) always takes the same kind of input; a sketch of this idea follows below.
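
A minimal C++ sketch of the common-base-class idea, purely for illustration: the type names (Descriptor, File, Target, BpfDescriptor, Matrix) are hypothetical placeholders and not the actual DAPHNE kernel interface.

```cpp
#include <memory>
#include <string>

// Common base for anything a read kernel can consume.
struct Descriptor {
    virtual ~Descriptor() = default;
};

// POSIX: open(filename) -> File
struct File : Descriptor {
    int fd = -1;            // underlying POSIX file descriptor
    std::string filename;
};

// BPF / computational storage:
//   open(dev) -> Target, open(Target, filename) -> BpfDescriptor
struct Target {
    std::string dev;        // the device is essentially a string
};
struct BpfDescriptor : Descriptor {
    Target target;
    std::string filename;
};

// A single readCsv kernel signature covers both cases.
struct Matrix;              // stands in for DAPHNE's matrix types
std::unique_ptr<Matrix> readCsv(const Descriptor &desc);
```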

Thanks @niclashedam for bringing up this topic and analyzing the requirements.


This issue is about the DaphneDSL/DaphneIR/compiler integration. We need additional built-in functions as well as IR operations and types, and these must be supported by the Daphne compiler.
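
On the IR side, a handle such as File could be represented as an opaque DaphneIR type. The following is only a sketch assuming MLIR's C++ type API; the class name FileType and the namespace are hypothetical, not existing DAPHNE code.

```cpp
#include "mlir/IR/Types.h"

namespace mlir::daphne {

// Hypothetical opaque handle type, produced by an open op and consumed by
// read/close ops; it carries no compile-time parameters.
class FileType : public Type::TypeBase<FileType, Type, TypeStorage> {
public:
    using Base::Base;
};

} // namespace mlir::daphne

// The dialect would register it in its initialize():
//   addTypes<mlir::daphne::FileType>();
```

Built-ins such as open, readCsv, and close would then map to IR operations whose operands/results use this type.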

This issue is closely related to #108 on the run-time part.

@daphne-eu (Owner, Author)

In GitLab by @pdamme on Aug 24, 2021, 15:45

mentioned in commit 2337047

@daphne-eu (Owner, Author)

In GitLab by @bonnet-p on Aug 27, 2021, 10:33

Dear Patrick, all,
A few comments on this issue:
(1) There is a silent assumption of an underlying file system in the first paragraph and throughout the text. This is neither general nor necessary. We should be careful not to conflate I/Os and files; these are two different levels of abstraction. The first question is what kind of abstraction is needed by the run-time. Do we need a universal storage abstraction between tables/matrices (dense and sparse) and NVMe? My initial take would be no.
(2) I think we can assume that NVMe will provide command sets for offloading BPF programs (CS command set) and reading/writing data (IO command set) on namespaces of different kinds, e.g., LBA, ZNS or KV (all equipped with administration command sets for identification and async operations). The question for us is whether this should be exposed to the runtime or whether some complexity should be abstracted. What we do not want to do is to introduce concepts or mix abstraction levels. In particular, files and BPF programs have nothing to do with each other.
(3) An additional issue is how to deal with storage tiering. Should various storage tiers be exposed to the run-time or should data placement/movement be managed separately from the run-time? This is a fundamental design decision.
(4) I think we can find (a) generic data structures to (i) name devices/namespaces and (ii) carry namespace context across asynchronous calls, and (b) a generic open/async-call/close interface that works across file systems and raw devices, with or without a computational storage processor (see the sketch after this comment). But do we need separate kernels for the three operations in (b)? This is something I do not understand.
Best,
Philippe.
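
A purely illustrative C++ sketch of the generic interface described in point (4): naming a device/namespace, carrying its context across asynchronous calls, and a uniform open/async-call/close surface. All names are invented for illustration and are not part of DAPHNE.

```cpp
#include <sys/types.h>   // ssize_t
#include <cstddef>
#include <functional>
#include <memory>
#include <string>

// Opaque context for an open device/namespace (POSIX file, NVMe LBA/ZNS/KV
// namespace, computational-storage target, ...); concrete backends subclass it.
struct IoContext {
    virtual ~IoContext() = default;
};

// Completion callback for asynchronous calls (bytes transferred or a
// negative error code).
using IoCallback = std::function<void(ssize_t bytesOrError)>;

// Uniform open/async-call/close interface; concrete backends hide whether a
// file system, a raw device, or a computational storage processor sits below.
struct IoBackend {
    virtual ~IoBackend() = default;

    // `name` identifies a device or namespace (e.g., a path or an NVMe
    // namespace identifier).
    virtual std::unique_ptr<IoContext> open(const std::string &name) = 0;

    // Asynchronous read; `cb` fires on completion.
    virtual void readAsync(IoContext &ctx, void *buf, std::size_t len,
                           std::size_t offset, IoCallback cb) = 0;

    virtual void close(std::unique_ptr<IoContext> ctx) = 0;
};
```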
