This commit uses the new `allocate_data_sample` and `send_output_sample` methods introduced in d7cd370,
instead of doing an additional copy to send outputs from the operator thread to the runtime thread.
This commit adds the ability to enable the `opentelemetry` feature flag at
runtime through the `DORA_TELEMETRY` environment variable. Using an environment
variable reduces the need for argument drilling between our
different processes and makes the setting configurable in the YAML at no extra cost.
Usage:
```bash
docker run -d -p6831:6831/udp -p6832:6832/udp -p16686:16686 jaegertracing/all-in-one:latest
DORA_TELEMETRY=true dora-daemon --run-dataflow dataflow.yml
```
Current limitation:
- Telemetry currently only works with Jaeger on the local network, using the
default ports 6831 and 6832.
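A minimal sketch of how a process could read such a flag from its environment without any argument being passed down. Only the `DORA_TELEMETRY` name and the `true` value come from this commit. The helper name `telemetry_enabled` and the other accepted spellings are assumptions made for illustration, not the actual dora implementation.

```python
import os


def telemetry_enabled(env=None):
    """Return True when DORA_TELEMETRY holds a truthy value.

    Hypothetical helper: only the value "true" is shown in the commit,
    the other accepted spellings are assumptions.
    """
    env = os.environ if env is None else env
    value = env.get("DORA_TELEMETRY", "")
    return value.strip().lower() in {"1", "true", "yes"}


# Child processes inherit the environment, so every process can make
# the same decision without extra command-line arguments.
if telemetry_enabled():
    print("telemetry enabled")
```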
Since #236 was added, initialization errors are shadowed by the bail on
the sender channel.
Bailing used to happen here: ccec196234/binaries/runtime/src/lib.rs (L136)
This makes initialization failures hard to debug.
This PR pushes the error into the sender channel and reports any
initialization error, making it easier to debug.
The channel is now used without error, and any initialization error
bails the runtime here: ccec196234/binaries/runtime/src/lib.rs (L137)
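The idea can be illustrated with a small Python analogy (the actual change is in the Rust runtime). The names `run_operator`, `failing_init` and the `("error", ...)` event shape are hypothetical. The sketch only shows the pattern: the worker sends its initialization error over the channel instead of silently dropping the sender, so the receiving side sees the real cause.

```python
import queue
import threading


def run_operator(init, events):
    """Run `init` and report the outcome over the `events` channel."""
    try:
        state = init()
    except Exception as err:
        # Forward the real cause instead of just closing the channel.
        events.put(("error", f"operator initialization failed: {err}"))
        return
    events.put(("ready", state))


def failing_init():
    # Hypothetical failure used for the demonstration.
    raise RuntimeError("missing model file")


events = queue.Queue()
worker = threading.Thread(target=run_operator, args=(failing_init, events))
worker.start()
worker.join()

# The receiving side gets the initialization error and can bail with it.
kind, payload = events.get_nowait()
print(kind, payload)
```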
To alleviate the unbounded memory growth, we're replacing variable dereferencing
with a scoped `GILPool`, as described in the recently updated pyo3 documentation. See:
https://github.com/PyO3/pyo3/pull/2864
This commit is an initial draft of a fix for #147. The error occurs because
pyo3 links against the libpython found at compile time instead of
using the libpython that is available in `LD_LIBRARY_PATH`.
Currently, the only way to disable this linking is to use the `extension-module` flag.
This requires packaging the Python `runtime-node` as a Python library.
The Python `runtime-node` should also stay fully compatible with the other operators in case we want a hybrid runtime.
The issue I'm facing is that, due to the packaging, I have to deal with the `GIL`, which is present from the start of the
`dora-runtime` node. This makes the process single-threaded, which is not workable.
So I have to disable it, but when I do, I hit a race condition:
```bash
Exception ignored in: <module 'threading' from '/usr/lib/python3.8/threading.py'>
Traceback (most recent call last):
File "/usr/lib/python3.8/threading.py", line 1373, in _shutdown
assert tlock.locked()
AssertionError:
```
The issue is the same as https://github.com/PyO3/pyo3/issues/1274
To fix this issue, I'm going to look again at the different steps needed to make this work.
This is the current state of my investigation.
* Changing `drop_operator` to `__del__` in `python.rs`
Using the default Python destructor for dora operators simplifies
the use of dora, as users do not have to learn as many custom
methods.
This also creates symmetry with `__init__` and reduces the risk of
errors if `drop_operator` is missing or misspelled.
It also makes dora operators usable outside of dora (for
testing, for example) and lets them be garbage collected naturally.
* fix clippy and add comments
* fix formatting
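As an illustration of the `__del__` change, here is a minimal sketch of an operator-like class used outside of dora. The class name `CounterOperator`, the `on_input` signature and the `released` list are hypothetical and only serve the example. The point is that cleanup lives in the standard Python destructor rather than in a custom `drop_operator` method.

```python
released = []  # records cleanup calls so they can be observed in a test


class CounterOperator:
    """Hypothetical operator that counts the inputs it receives."""

    def __init__(self):
        self.count = 0

    def on_input(self, value):
        self.count += 1
        return self.count

    def __del__(self):
        # Standard Python destructor: no custom `drop_operator` needed.
        released.append(self.count)


op = CounterOperator()
op.on_input(b"a")
op.on_input(b"b")

# In CPython, dropping the last reference runs `__del__` right away.
del op
```

Because the destructor is the standard one, a unit test can create and discard the operator like any other Python object.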
Makes it possible to set a URL as the operator source. The `dora-runtime` will then try to download the operator from the given URL when the dataflow is started.
* Add a `dora-python-operator` crate to hold utility functions for dora Python
* Remove python serialisation and deserialisation from `dora-runtime`
* Update `python` documentation
* add opentelemetry feature to python op
* modify `python-dataflow` to test opentelemetry
* change `timer` topic to `tick`
* adding tracing to `shared_libraries`
* example install refactoring
* Rename `input` to `dora_input` for python ex
Fix Python operator error due to missing metadata.
This commit adds an optional metadata argument, passed as a Python dictionary.
Using a Python dictionary instead of a class gives us more flexibility, at the cost of parsing items and casting types.
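The trade-off can be sketched as follows. The function name `normalize_metadata`, the key `deadline_ms` and the choice to cast every value to `str` are assumptions for illustration, not the behaviour of the actual dora code. The sketch shows that an optional dictionary is easy to pass or omit, while the receiving side has to validate and cast the entries itself.

```python
def normalize_metadata(metadata=None):
    """Validate an optional metadata dict and cast its values to str.

    Hypothetical helper: casting to str is an assumption standing in for
    whatever typed representation the receiving side expects.
    """
    if metadata is None:
        # Metadata is optional: a missing argument means "no metadata".
        return {}
    if not isinstance(metadata, dict):
        raise TypeError("metadata must be a dict or None")

    normalized = {}
    for key, value in metadata.items():
        if not isinstance(key, str):
            raise TypeError(f"metadata key {key!r} is not a string")
        normalized[key] = str(value)
    return normalized


print(normalize_metadata())
print(normalize_metadata({"deadline_ms": 50}))
```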
Changes the message format from raw data bytes to a higher-level `Message` struct serialized with capnproto. In addition to the raw data, which is sent as a byte array as before, the `Message` struct features a `metadata` field. This metadata field can be used to pass OpenTelemetry contexts, deadlines, etc.
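To make the shape of such a message concrete, here is a minimal sketch that uses a length-prefixed JSON header as a stand-in for capnproto. It is not the actual wire format. The `to_bytes`/`from_bytes` methods and the `deadline_ms` key are hypothetical, and only the `Message` name and its data plus `metadata` layout come from the description above.

```python
import json
import struct
from dataclasses import dataclass, field


@dataclass
class Message:
    """Raw data bytes plus a metadata mapping (stand-in, not capnproto)."""

    data: bytes
    metadata: dict = field(default_factory=dict)

    def to_bytes(self):
        # 4-byte little-endian header length, JSON header, then raw data.
        header = json.dumps(self.metadata).encode("utf-8")
        return struct.pack("<I", len(header)) + header + self.data

    @classmethod
    def from_bytes(cls, raw):
        (header_len,) = struct.unpack_from("<I", raw, 0)
        header = raw[4:4 + header_len]
        return cls(data=raw[4 + header_len:], metadata=json.loads(header))


message = Message(data=b"\x01\x02\x03", metadata={"deadline_ms": 50})
decoded = Message.from_bytes(message.to_bytes())
print(decoded)
```

The raw data stays an untouched byte array, and the metadata travels alongside it in the same message.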