diff --git a/examples/multiple-daemons/README.md b/examples/multiple-daemons/README.md
index be027a1f..fc08b50c 100644
--- a/examples/multiple-daemons/README.md
+++ b/examples/multiple-daemons/README.md
@@ -29,3 +29,63 @@ Execute the following steps in this directory:
 - As above, you can specify a `RUST_LOG` env variable for more output if you like
 - Start the dataflow through `dora start dataflow.yml`
+# Usage Across Multiple Machines
+
+1. Start the dora coordinator on a machine that is reachable by IP from all other machines:
+   ```bash
+   dora coordinator
+   ```
+   - If you plan to run dora in the cloud, the machine that runs the `dora coordinator` needs to have a public IP.
+   - Set a `RUST_LOG` environment variable to get more output on the command line. For example, `RUST_LOG=debug dora coordinator` reports when daemons connect and when dataflow start commands are received.
+2. Start dora daemon instances on all machines that should run parts of the dataflow.
+   - If all daemons and the coordinator are **in the same network** (e.g. the same WiFi), use the following command:
+     ```bash
+     RUST_LOG=debug dora daemon --coordinator-addr <coordinator IP> --machine-id <machine ID>
+     ```
+     Replace `<coordinator IP>` with the (public) IP address of the coordinator. Replace `<machine ID>` with a unique identifier for each machine. This machine ID will be used to assign nodes to machines in the `dataflow.yml` file.
+
+     You can **omit the `--machine-id` argument on one machine**. This will be the default machine for nodes that don't specify deploy information.
+
+     The `RUST_LOG=debug` environment variable is optional. It enables some useful debug output about the dataflow run.
+   - If the daemons/coordinator are in **different networks** (e.g. behind a [NAT](https://en.wikipedia.org/wiki/Network_address_translation)), you need to set up one or multiple [zenoh routers](https://zenoh.io/docs/getting-started/deployment/#zenoh-router). This requires running a [`zenohd`](https://zenoh.io/docs/getting-started/installation/) instance in every subnet. This is also possible [using a Docker container](https://zenoh.io/docs/getting-started/quick-test/).
+
+     Once your `zenohd` instances are running, you can start the `dora daemon` instances as before, but now with a `ZENOH_CONFIG` environment variable set:
+     ```bash
+     ZENOH_CONFIG=<path to zenoh config> RUST_LOG=debug dora daemon --coordinator-addr <coordinator IP> --machine-id <machine ID>
+     ```
+     Replace `<path to zenoh config>` with the path to a [zenoh configuration file](https://zenoh.io/docs/manual/configuration/#configuration-files) that lists the corresponding `zenohd` instance(s) under `connect.endpoints`.
+
+3. In your `dataflow.yml` file, add an `_unstable_deploy` key to all nodes that should not run on the default machine:
+   ```yml
+   - id: example-node
+     _unstable_deploy:
+       machine: laptop # this should match the <machine ID> of one daemon
+       path: /home/user/dora/target/debug/rust-dataflow-example-node
+     inputs: [...]
+     outputs: [...]
+   ```
+   The `path` field should be an absolute path because we don't have a clear default for the working directory on remote machines yet.
+4. Build all the nodes on their target machines, or build them locally and copy them over.
+
+   Dora does **not** do any deployment of executables yet. You need to copy the executables/Python code yourself.
+5. Use the CLI to start the dataflow:
+   ```bash
+   RUST_LOG=debug dora start --coordinator-addr <coordinator IP> dataflow.yml
+   ```
+
+   Replace `<coordinator IP>` with the (public) IP address of the coordinator. The `--coordinator-addr` argument is optional if the coordinator is running on the same machine as the CLI.
+
+   As above, the `RUST_LOG=debug` env variable is optional. You can omit it if you're not interested in the additional output.
+
+## Zenoh Topics
+
+Dora publishes outputs destined for remote nodes on the following zenoh topic names:
+
+```
+dora/{network_id}/{dataflow_id}/output/{node_id}/{output_id}
+```
+
+- The `{network_id}` is always `default` for now. In the future, we plan to make it configurable to allow operating multiple separate dora networks within the same WiFi.
+- The `{dataflow_id}` is the UUID assigned to the dataflow instance on start. This UUID will be different on every start. (If you want to debug your dataflow using a separate zenoh client and you don't care about the dataflow instance, you can use a `*` wildcard when subscribing.)
+- The `{node_id}` is the `id` field of the node as specified in the `dataflow.yml`.
+- The `{output_id}` is the name of the node output as specified in the `dataflow.yml`.
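
For reference, the `ZENOH_CONFIG` file from step 2 might look like the following minimal sketch. Zenoh configuration files use JSON5; the router address here is a placeholder — replace it with the address of the `zenohd` instance in your subnet (7447 is zenoh's default port):

```json5
{
  // connect this daemon's zenoh session to the router of its subnet
  connect: {
    endpoints: ["tcp/192.0.2.10:7447"]
  }
}
```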
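
The topic scheme above can be expressed as a small helper, e.g. for building key expressions when debugging with a separate zenoh client. This is just an illustrative sketch — the function name and the example IDs are made up, only the topic pattern comes from the scheme above:

```python
def output_topic(network_id: str, dataflow_id: str, node_id: str, output_id: str) -> str:
    """Build the zenoh topic name that dora uses for a remote node output."""
    return f"dora/{network_id}/{dataflow_id}/output/{node_id}/{output_id}"

# A concrete dataflow instance (the UUID is a made-up example value):
print(output_topic("default", "b0f4c2a6-0f7a-4c2e-9d2e-1d3b5a6c7e8f", "example-node", "image"))

# When the dataflow UUID doesn't matter, substitute a `*` wildcard:
print(output_topic("default", "*", "example-node", "image"))
# → dora/default/*/output/example-node/image
```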