
Redesign of the `docs` as an `mdBook` (#61)

tags/v0.0.0-test.4
Xavier Tao committed 3 years ago
parent commit 12669e191d
19 changed files with 587 additions and 65 deletions
  1. .github/workflows/gh-pages.yml (+32, -0)
  2. README.md (+2, -2)
  3. docs/.gitignore (+1, -0)
  4. docs/book.toml (+6, -0)
  5. docs/design/README.md (+0, -7)
  6. docs/src/SUMMARY.md (+25, -0)
  7. docs/src/c-api.md (+107, -0)
  8. docs/src/communication-layer.md (+32, -0)
  9. docs/src/dataflow-config.md (+2, -48)
  10. docs/src/getting-started.md (+117, -0)
  11. docs/src/installation.md (+29, -0)
  12. docs/src/introduction.md (+30, -0)
  13. docs/src/library-vs-framework.md (+2, -8)
  14. docs/src/logo.svg (+0, -0)
  15. docs/src/overview.md (+0, -0)
  16. docs/src/overview.svg (+0, -0)
  17. docs/src/python-api.md (+85, -0)
  18. docs/src/rust-api.md (+117, -0)
  19. docs/src/state-management.md (+0, -0)

.github/workflows/gh-pages.yml (+32, -0)

@@ -0,0 +1,32 @@

name: github pages

on:
  push:
    branches:
      - main
  pull_request:

jobs:
  deploy:
    runs-on: ubuntu-20.04
    concurrency:
      group: ${{ github.workflow }}-${{ github.ref }}
    steps:
      - uses: actions/checkout@v2

      - name: Setup mdBook
        uses: peaceiris/actions-mdbook@v1
        with:
          mdbook-version: '0.4.10'
          # mdbook-version: 'latest'

      - run: mdbook build docs

      - name: Deploy
        uses: peaceiris/actions-gh-pages@v3
        if: ${{ github.ref == 'refs/heads/main' }}
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./docs/book

README.md (+2, -2)

@@ -1,5 +1,5 @@
<p align="center">
<img src="./docs/design/logo.svg" width="400">
<img src="./docs/src/logo.svg" width="400">
</p>

<h3 align="center">
@@ -20,7 +20,7 @@ Composability as:
- [ ] language-agnostic:
- [x] Rust
- [x] C
- [ ] Python
- [x] Python
- [ ] Isolated operator and node that can be reused.

Low latency as:


docs/.gitignore (+1, -0)

@@ -0,0 +1 @@
book/

docs/book.toml (+6, -0)

@@ -0,0 +1,6 @@
[book]
authors = ["Haixuan Xavier Tao", "Philipp Oppermann"]
language = "en"
multilingual = false
src = "src"
title = "dora-rs"

docs/design/README.md (+0, -7)

@@ -1,7 +0,0 @@
# Design

This folder contains our work-in-progress design documents:

- [**Overview:**](overview.md) A high level overview of dora's different components and the differences between operators and custom nodes.
- [**Dataflow Configuration:**](dataflow-config.md) Specification of dora's YAML-based dataflow configuration format.
- [**State Management:**](state-management.md) Describes dora's state management capabilities for state sharing and recovery.

docs/src/SUMMARY.md (+25, -0)

@@ -0,0 +1,25 @@
# Summary

[Introduction](./introduction.md)

---

# User Guide

- [Installation](./installation.md)
- [Getting started](./getting-started.md)

# Reference Guide


- [Overview](overview.md): A high-level overview of dora's different components and the differences between operators and custom nodes.
- [Dataflow Configuration](dataflow-config.md): Specification of dora's YAML-based dataflow configuration format.
- [Rust API](rust-api.md)
- [C API](c-api.md)
- [Python API](python-api.md)

# Brainstorming Ideas

- [State Management](state-management.md): Describes dora's state management capabilities for state sharing and recovery.
- [Library vs Framework](library-vs-framework.md)
- [Middleware Layer Abstraction](./communication-layer.md)

docs/src/c-api.md (+107, -0)

@@ -0,0 +1,107 @@
# C API

## Operator

The operator API is a framework for you to implement. The implemented operator will be managed by `dora`. This framework enables us to apply optimisations and provide advanced features.

The operator definition is composed of three functions: `dora_init_operator`, which initialises the operator and its context; `dora_drop_operator`, which frees the memory; and `dora_on_input`, which runs the operator's logic when an input is received.

```c
int dora_init_operator(void **operator_context)
{
    // allocate a single byte to store a counter
    // (the operator context pointer can be used to keep arbitrary data between calls)
    void *context = malloc(1);

    char *context_char = (char *)context;
    *context_char = 0;

    *operator_context = context;

    return 0;
}

void dora_drop_operator(void *operator_context)
{
    free(operator_context);
}

int dora_on_input(
    const char *id_start,
    size_t id_len,
    const char *data_start,
    size_t data_len,
    const int (*output_fn_raw)(const char *id_start,
                               size_t id_len,
                               const char *data_start,
                               size_t data_len,
                               const void *output_context),
    void *output_context,
    const void *operator_context)
{
    // handle the input ...
    // (sending outputs is possible using `output_fn_raw`)
    // (the `operator_context` is the pointer created in `dora_init_operator`, i.e., a counter in our case)
}
```
### Try it out!

- Create an `operator.c` file:
```c
{{#include ../../examples/c-dataflow/operator.c}}
```

{{#include ../../examples/c-dataflow/README.md:40:46}}

- Link it in your graph as:
```yaml
{{#include ../../binaries/coordinator/examples/mini-dataflow.yml:47:52}}
```

## Custom Node

The custom node API allows you to integrate `dora` into your application. It lets you retrieve inputs and send outputs in any fashion you want.

#### `init_dora_context_from_env`

`init_dora_context_from_env` initiates a node from the environment variables set by the `dora-coordinator`.

```c
void *dora_context = init_dora_context_from_env();
```

#### `dora_next_input`

`dora_next_input` waits for the next input. To extract the input ID and data, use `read_dora_input_id` and `read_dora_input_data` on the returned pointer.

```c
void *input = dora_next_input(dora_context);

// read out the ID as a UTF8-encoded string
char *id;
size_t id_len;
read_dora_input_id(input, &id, &id_len);

// read out the data as a byte array
char *data;
size_t data_len;
read_dora_input_data(input, &data, &data_len);
```

#### `dora_send_output`

`dora_send_output` sends data from the node.

```c
char out_id[] = "tick";
char out_data[] = {0, 0, 0};
dora_send_output(dora_context, out_id, strlen(out_id), out_data, sizeof out_data);
```
### Try it out!

- Create a `node.c` file:
```c
{{#include ../../examples/c-dataflow/node.c}}
```

{{#include ../../examples/c-dataflow/README.md:26:35}}

docs/src/communication-layer.md (+32, -0)

@@ -0,0 +1,32 @@
# [Middleware (communication) layer abstraction (MLA)](https://github.com/dora-rs/dora/discussions/53)

`dora` needs to implement the MLA as a separate crate that provides a middleware abstraction layer enabling scalable, high-performance communication between async tasks, between OS threads within a process, between processes on a single computer node, or between different nodes in a computer network. The MLA needs to support different communication patterns:
- publish-subscribe push / push pattern - the published message is pushed to subscribers
- publish-subscribe push / pull pattern - the published message is written to storage and later pulled by subscribers
- request / reply pattern
- point-to-point pattern
- client / server pattern

The MLA needs to abstract the following details:

- inter-async-task (e.g., tokio channels), intra-process (OS threads, e.g., shared memory), inter-process, and inter-host / inter-network communication
- different transport layer implementations (shared memory, UDP, TCP)
- built-in support for multiple serialization / deserialization protocols, e.g., Cap'n Proto, Protobuf, FlatBuffers, etc.
- different language bindings to Rust, Python, C, C++, etc.
- telemetry tools for logs, metrics, distributed tracing, live data monitoring (e.g., tapping live data), recording, and replay

The Rust ecosystem has abundant crates providing the underlying communication, e.g.:
- tokio / crossbeam provide different types of channels serving different purposes: mpsc, oneshot, broadcast, watch, etc.
- Tonic provides gRPC services
- Tower provides a request/reply service
- The Zenoh middleware provides many different pub/sub capabilities

The MLA also needs to provide high-level APIs:
- `publish(topic, value, optional fields)` - the optional fields may contain the sender's identity to help the MLA satisfy the above requirements
- `subscribe(topic, optional fields) -> future stream`
- `put(key, value, optional fields)`
- `get(key, optional fields) -> value`
- `send(key, msg, optional fields)`
- `recv(key, optional fields) -> value`
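As a sketch of how these high-level APIs could fit together, here is a minimal in-memory stand-in. The `Mla` type and its methods are hypothetical illustrations of the API surface listed above, not part of dora; a real implementation would dispatch to zenoh, TCP, shared memory, etc.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// A toy in-memory broker illustrating the proposed API surface.
#[derive(Default)]
struct Mla {
    subscribers: HashMap<String, Vec<Sender<Vec<u8>>>>,
    store: HashMap<String, Vec<u8>>,
}

impl Mla {
    /// publish(topic, value): push the message to all current subscribers.
    fn publish(&mut self, topic: &str, value: &[u8]) {
        if let Some(subs) = self.subscribers.get(topic) {
            for sub in subs {
                let _ = sub.send(value.to_vec());
            }
        }
    }

    /// subscribe(topic): the receiver stands in for a future stream.
    fn subscribe(&mut self, topic: &str) -> Receiver<Vec<u8>> {
        let (tx, rx) = channel();
        self.subscribers.entry(topic.to_string()).or_default().push(tx);
        rx
    }

    /// put(key, value): write to storage for later pulls (push / pull pattern).
    fn put(&mut self, key: &str, value: &[u8]) {
        self.store.insert(key.to_string(), value.to_vec());
    }

    /// get(key): pull a previously stored value.
    fn get(&self, key: &str) -> Option<&[u8]> {
        self.store.get(key).map(|v| v.as_slice())
    }
}
```

The same surface could then be backed by different transports without changing callers, which is the point of the abstraction layer.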

More info here: [#53](https://github.com/dora-rs/dora/discussions/53)

docs/design/dataflow-config.md → docs/src/dataflow-config.md (+2, -48)

@@ -96,57 +96,11 @@ Each operator must specify exactly one implementation. The implementation must f

## Example

TODO:


```yaml
nodes:
  - id: main
    name: Main Node
    description: Implements the main task

    operators:
      # sources (only outputs)
      - id: timer
        name: Clock timer # optional, human-readable name
        description: Send the time.
        shared_library: path/to/timer.so
        outputs:
          - time
      - id: camera-uuid-1
        name: Front camera
        python: camera.py
        outputs:
          - image
          - metadata
      # actions (inputs and outputs)
      - id: timestamp
        description: Add a watermark on the camera.
        python: timestamp.py
        inputs:
          timestamp: timer/time
          image: camera-uuid-1/image
        outputs:
          - image # with timestamp watermark

      # sinks (only inputs)
      - id: logger
        description: Sink the data into the logger.
        python: logger.py
        inputs:
          image: timestamp/image
          camera: camera-uuid-2/metadata

      - id: camera-uuid-2
        name: Back camera
        run: camera_driver --kind back
        outputs:
          - image
          - metadata
{{#include ../../binaries/coordinator/examples/mini-dataflow.yml}}
```


## TODO: Integration with ROS 1/2

To integrate dora-rs operators with ROS1 or ROS2 operators, we plan to provide special _bridge operators_. These operators act as a sink in one dataflow framework and push all messages to a different dataflow framework, where they act as a source.
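Conceptually, such a bridge reduces to a loop that drains one framework's sink and feeds the other framework's source. A minimal sketch, with plain channels standing in for the dora input stream and a ROS publisher (all names here are hypothetical, since the bridge operators do not exist yet):

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Forward every message received from one dataflow framework (where the
/// bridge acts as a sink) into another (where it acts as a source).
fn run_bridge(from_dora: Receiver<Vec<u8>>, to_ros: Sender<Vec<u8>>) {
    // The loop ends when the sending side closes, mirroring a source shutdown.
    for message in from_dora {
        if to_ros.send(message).is_err() {
            break; // the other framework shut down
        }
    }
}
```

In a real bridge, the forwarding step would also translate between the two frameworks' message encodings.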

docs/src/getting-started.md (+117, -0)

@@ -0,0 +1,117 @@
### Create a Rust workspace

- Initiate the workspace with:

```bash
mkdir my_first_dataflow
cd my_first_dataflow
```

- Create the Cargo.toml file that will configure the entire workspace:

`Cargo.toml`
```toml
[workspace]

members = [
    "source_timer",
]
```

### Write your first node

Let's write a node that sends the current time periodically and stops after 100 iterations. The other nodes/operators will then exit as well, because all sources have closed.
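The shape of that node can be sketched as follows. This is only an illustration of the loop: the `send_output` callback is a hypothetical stand-in for the dora-node-api call, and the real node formats the time with the `time` crate.

```rust
use std::thread;
use std::time::Duration;

/// Send the current time `iterations` times, then return.
/// Returning closes the node's outputs, which lets downstream
/// nodes/operators exit because all their sources have closed.
fn run_timer(mut send_output: impl FnMut(&str, &[u8]), iterations: u32, period: Duration) {
    for _ in 0..iterations {
        thread::sleep(period);
        // Stand-in for a formatted timestamp from the `time` crate.
        let now = format!("{:?}", std::time::SystemTime::now());
        send_output("time", now.as_bytes());
    }
}
```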

- Generate a new Rust binary (application):

```bash
cargo new source_timer
```

with `Cargo.toml`:
```toml
[package]
name = "source_timer"
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"

[dependencies]
dora-node-api = { git = "https://github.com/dora-rs/dora" }
time = "0.3.9"
```

with `src/main.rs`:
```rust
{{#include ../../binaries/coordinator/examples/nodes/rust/source_timer.rs}}
```

### Write your second node

Let's write a `logger` which will print incoming data.

- Generate a new Rust binary (application):

```bash
cargo new sink_logger
```

with `Cargo.toml`:
```toml
[package]
name = "sink_logger"
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"

[dependencies]
dora-node-api = { git = "https://github.com/dora-rs/dora" }
time = "0.3.9"
```

with `src/main.rs`:
```rust
{{#include ../../binaries/coordinator/examples/nodes/rust/sink_logger.rs}}
```

- And modify the root `Cargo.toml`:
```toml
[workspace]

members = [
    "source_timer",
    "sink_logger",
]
```


### Write a graph definition

Let's write the graph definition so that the nodes know who to communicate with.

`mini-dataflow.yml`
```yaml
communication:
  zenoh:
    prefix: /foo

nodes:
  - id: timer
    custom:
      run: cargo run --release --bin source_timer
      outputs:
        - time

  - id: logger
    custom:
      run: cargo run --release --bin sink_logger
      inputs:
        time: timer/time
```

### Run it!

- Run the `mini-dataflow`:
```bash
dora-coordinator run mini-dataflow.yml
```

docs/src/installation.md (+29, -0)

@@ -0,0 +1,29 @@
# Installation

This project is in early development; many features have yet to be implemented, and breaking changes are expected. Please don't take the current design for granted. The installation process will be streamlined in the future.

### 1. Compile the dora-coordinator

The `dora-coordinator` is responsible for reading the dataflow descriptor file and launching the operators accordingly.

Build it using:
```bash
git clone https://github.com/dora-rs/dora.git
cd dora
cargo build -p dora-coordinator --release
```

### 2. Compile the dora-runtime for operators

The `dora-runtime` is responsible for managing a set of operators.
```bash
cargo build -p dora-runtime --release
```

### 3. Add those binaries to your path

This step is optional. You can also refer to the executables using their full path or copy them somewhere else.

```bash
export PATH=$PATH:$(pwd)/target/release
```

docs/src/introduction.md (+30, -0)

@@ -0,0 +1,30 @@
# Welcome to `dora`!

`dora`'s goal is to be a low-latency, composable, and distributed dataflow framework.

By using `dora`, you can define robotic applications as a graph of nodes that can be easily swapped and replaced. Those nodes can be shared and implemented in different languages such as Rust, Python or C. `dora` will then connect those nodes and try to provide as many features as possible to facilitate the dataflow.

## ✨ Features that we want to provide

Composability as:
- [x] `YAML` declarative programming
- [x] language-agnostic:
- [x] Rust
- [x] C
- [x] Python
- [ ] Isolated operators and nodes that can be reused.

Low latency as:
- [x] written in <i>...Cough...blazingly fast ...Cough...</i> Rust.
- [ ] Minimal abstraction, close to the metal.

Distributed as:
- [x] PubSub communication with [`zenoh`](https://github.com/eclipse-zenoh/zenoh)
- [x] Distributed telemetry with [`opentelemetry`](https://github.com/open-telemetry/opentelemetry-rust)



## ⚖️ LICENSE

This project is licensed under Apache-2.0. Check out [NOTICE.md](NOTICE.md) for more information.


docs/design/rust-client.md → docs/src/library-vs-framework.md (+2, -8)

@@ -1,10 +1,4 @@
# Rust Client Design

## Brainstorm

framework vs library

### Framework
# Framework

- Runtime process
- Talks with other runtime processes
@@ -27,7 +21,7 @@ framework vs library
- by runtime -> aggregation specified in config file
- by operator -> custom handling possible

### Library
# Library

- All sources/operator/sinks are separate processes that link a runtime library
- "Orchestrator" process

docs/design/logo.svg → docs/src/logo.svg


docs/design/overview.md → docs/src/overview.md


docs/design/overview.svg → docs/src/overview.svg


docs/src/python-api.md (+85, -0)

@@ -0,0 +1,85 @@
# Python API

## Operator

The operator API is a framework for you to implement. The implemented operator will be managed by `dora`. This framework enables us to apply optimisations and provide advanced features. It is the recommended way of using `dora`.

An operator requires an `on_input` method and must return a `DoraStatus` of 0 or 1, depending on whether it needs to continue or stop.

```python
class Operator:
    def on_input(
        self,
        input_id: str,
        value: bytes,
        send_output: Callable[[str, bytes], None],
    ) -> DoraStatus:
```

> For Python, we recommend allocating all operators on a single runtime. A runtime shares the same GIL between its operators, making those operators run almost sequentially. See: [https://docs.rs/pyo3/latest/pyo3/marker/struct.Python.html#deadlocks](https://docs.rs/pyo3/latest/pyo3/marker/struct.Python.html#deadlocks)
### Try it out!

- Create an operator Python file called `op.py`:
```python
{{#include ../../examples/python-operator/op.py}}
```

- Link it in your graph as:
```yaml
{{#include ../../binaries/coordinator/examples/graphs/mini-dataflow.yml:67:73}}
```

## Custom Node

The custom node API allows you to integrate `dora` into your application. It lets you retrieve inputs and send outputs in any fashion you want.

#### `Node()`

`Node()` initiates a node from the environment variables set by the `dora-coordinator`.

```python
from dora import Node

node = Node()
```

#### `.next()` or `__next__()` as an iterator

`.next()` gives you the next input that the node has received. It blocks until the next input becomes available and returns `None` once all senders have been dropped.

```python
input_id, value = node.next()

# or

for input_id, value in node:
```

#### `.send_output(output_id, data)`

`send_output` sends data from the node.

```python
node.send_output("string", b"string")
```


### Try it out!

- Install python node API:
```bash
cd apis/python/node
python3 -m venv .env
source .env/bin/activate
pip install maturin
maturin develop
```

- Create a python file called `printer.py`:
```python
{{#include ../../binaries/coordinator/examples/nodes/python/printer.py}}
```

- Link it in your graph as:
```yaml
{{#include ../../binaries/coordinator/examples/graphs/python_test.yml:12:17}}
```

docs/src/rust-api.md (+117, -0)

@@ -0,0 +1,117 @@
# Rust API

## Operator

The operator API is a framework for you to implement. The implemented operator will be managed by `dora`. This framework enables us to apply optimisations and provide advanced features. It is the recommended way of using `dora`.

An operator needs to be registered and must implement the `DoraOperator` trait, which contains an `on_input` method that defines the behaviour of the operator when an input arrives.

```rust
use dora_operator_api::{register_operator, DoraOperator, DoraOutputSender, DoraStatus};

register_operator!(ExampleOperator);

#[derive(Debug, Default)]
struct ExampleOperator {
    time: Option<String>,
}

impl DoraOperator for ExampleOperator {
    fn on_input(
        &mut self,
        id: &str,
        data: &[u8],
        output_sender: &mut DoraOutputSender,
    ) -> Result<DoraStatus, ()> {
```

### Try it out!

- Generate a new Rust library

```bash
cargo new example-operator --lib
```

`Cargo.toml`
```toml
{{#include ../../examples/example-operator/Cargo.toml}}
```

`src/lib.rs`
```rust
{{#include ../../examples/example-operator/src/lib.rs}}
```

- Build it:
```bash
cargo build --release
```

- Link it in your graph as:
```yaml
{{#include ../../binaries/coordinator/examples/mini-dataflow.yml:38:46}}
```

This example can be found in `examples`.

## Custom Node

The custom node API allows you to integrate `dora` into your application. It lets you retrieve inputs and send outputs in any fashion you want.

#### `DoraNode::init_from_env()`

`DoraNode::init_from_env()` initiates a node from the environment variables set by the `dora-coordinator`.

```rust
let node = DoraNode::init_from_env().await?;
```

#### `.inputs()`

`.inputs()` gives you a stream of inputs that you can consume by calling `next()` on it.

```rust
let mut inputs = node.inputs().await?;
```
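The typical consumption pattern is a loop that dispatches on the input ID until all sources have closed. The sketch below illustrates only the loop shape: a plain iterator and a hypothetical `Input` struct stand in for the async stream, whose calls are `.await`ed in the real API.

```rust
/// Simplified stand-in for a dora input: an ID plus a byte payload.
struct Input {
    id: String,
    data: Vec<u8>,
}

/// Loop over the inputs until the stream ends (i.e. `next()` yields `None`,
/// meaning all sources have closed), dispatching on the input ID.
fn consume(inputs: impl Iterator<Item = Input>) -> usize {
    let mut handled = 0;
    for input in inputs {
        match input.id.as_str() {
            "time" => {
                // e.g. decode and log the received timestamp
                let _text = String::from_utf8_lossy(&input.data);
                handled += 1;
            }
            _ => {} // ignore unknown inputs
        }
    }
    handled
}
```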

#### `.send_output(output_id, data)`

`send_output` sends data from the node.

```rust
node.send_output(&data_id, data.as_bytes()).await?;
```

### Try it out!

- Generate a new Rust binary (application):

```bash
cargo new source_timer
```

```toml
[package]
name = "source_timer"
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"

[dependencies]
dora-node-api = { path = "../../apis/rust/node" }
time = "0.3.9"
```

`src/main.rs`
```rust
{{#include ../../binaries/coordinator/examples/nodes/rust/source_timer.rs}}
```

- Link it in your graph as:
```yaml
- id: timer
  custom:
    run: cargo run --release
    outputs:
      - time
```

docs/design/state-management.md → docs/src/state-management.md

