Some applications might not need the data in arrow format. By creating a custom object, we can provide the data both as a classical `PyBytes` and as a zero-copy arrow array.
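A minimal sketch of what such a custom object could look like, assuming pyo3's GIL-ref API and the `arrow` crate's `pyarrow` feature; the type and method names are illustrative, not the actual dora API:

```rust
use arrow::array::{Array, UInt8Array};
use arrow::pyarrow::ToPyArrow;
use pyo3::prelude::*;
use pyo3::types::PyBytes;

/// Hypothetical wrapper around an event payload.
#[pyclass]
struct PyEvent {
    data: Vec<u8>,
}

#[pymethods]
impl PyEvent {
    /// Classical access: returns the payload as a Python `bytes` object.
    fn bytes<'py>(&self, py: Python<'py>) -> &'py PyBytes {
        PyBytes::new(py, &self.data)
    }

    /// Arrow access: exports the payload through the Arrow C Data Interface
    /// so that `pyarrow` can wrap the buffer without re-serializing it.
    /// (The clone is only here to keep the sketch simple.)
    fn arrow(&self, py: Python<'_>) -> PyResult<PyObject> {
        let array = UInt8Array::from(self.data.clone());
        array.into_data().to_pyarrow(py)
    }
}
```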
Don't bind the lifetime of the event to the `next` call anymore. This makes it possible to use the original event in Python, which does not support borrowed data or lifetimes. The drawback is that we no longer have the guarantee that an event is freed before the next call to `recv`.
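A rough sketch of the shape of this change (type and field names assumed, not the actual dora API):

```rust
use std::collections::VecDeque;

/// Owned event: no lifetime parameter, so Python may keep it alive for as
/// long as it wants.
pub struct Event {
    pub id: String,
    pub data: Vec<u8>,
}

pub struct EventStream {
    queue: VecDeque<Event>,
}

impl EventStream {
    // Previously something like `fn next(&mut self) -> Option<Event<'_>>`,
    // which tied the returned event's lifetime to this call.
    pub fn next(&mut self) -> Option<Event> {
        self.queue.pop_front()
    }
}
```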
This commit is an initial draft at fixing #147. The error comes from the fact that
pyo3 links against the libpython found at compile time instead of using the
libpython available in `LD_LIBRARY_PATH` at runtime.
Currently, the only way to disable this linking is to use the `extension-module` feature flag.
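For reference, enabling the flag is a one-line change in the Python binding crate's `Cargo.toml` (the version number is only illustrative):

```toml
[dependencies]
# Disables linking against libpython; the crate must then be loaded as a
# Python extension module (e.g. built and packaged with maturin).
pyo3 = { version = "0.16", features = ["extension-module"] }
```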
Using `extension-module` in turn requires packaging the Python `runtime-node` as a Python library.
The Python `runtime-node` should also stay fully compatible with the other operators in case we want a hybrid runtime.
The issue I'm facing is that, because of this packaging, I have to deal with the `GIL`, which is held from the start of the
`dora-runtime` node. This makes the process effectively single-threaded, which is not workable.
So I have to release it, but when I do, I get a race condition:
```bash
Exception ignored in: <module 'threading' from '/usr/lib/python3.8/threading.py'>
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 1373, in _shutdown
    assert tlock.locked()
AssertionError:
```
The issue is the same as https://github.com/PyO3/pyo3/issues/1274
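For context, pyo3 allows releasing the GIL around a long-running call with `Python::allow_threads`; a minimal sketch of that pattern (the entry point below is hypothetical):

```rust
use pyo3::prelude::*;

/// Hypothetical entry point for starting the runtime from Python.
#[pyfunction]
fn run_runtime(py: Python<'_>) -> PyResult<()> {
    // `allow_threads` releases the GIL while the closure runs, so the Rust
    // side is free to use multiple threads; any Python callback inside the
    // closure has to re-acquire the GIL with `Python::with_gil`.
    py.allow_threads(|| {
        // The blocking runtime loop would go here.
    });
    Ok(())
}
```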
To fix this issue, I'm going to look again at the different steps needed to make this work.
But this is where my investigation currently stands.
* Add a `dora-python-operator` crate to hold utility functions for dora Python
* Remove python serialisation and deserialisation from `dora-runtime`
* Update `python` documentation
Changes the message format from raw data bytes to a higher-level `Message` struct serialized with capnproto. In addition to the raw data, which is still sent as a byte array, the `Message` struct features a `metadata` field. This metadata field can be used to pass OpenTelemetry contexts, deadlines, etc.
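The rough shape of the struct, with illustrative field types (the authoritative definition is the capnproto schema):

```rust
/// Illustrative only; the real definition lives in the capnproto schema and
/// its generated code.
struct Message {
    /// Raw payload, sent as a byte array exactly as before.
    data: Vec<u8>,
    /// Context travelling alongside the data.
    metadata: Metadata,
}

struct Metadata {
    /// Serialized OpenTelemetry context (representation assumed).
    open_telemetry_context: String,
    /// Optional processing deadline (representation assumed).
    deadline_ms: Option<u64>,
}
```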
Adding `next` and `send_output` requires an async thread pool, as the
communication layer defined by the middleware layer returns an async
stream of futures.
I solve this by running a tokio runtime on a separate thread that is connected through two channels:
one for sending data and one for receiving data.
Those channels are then exposed synchronously to Python. This should not be a cause for
concern, as the channels are very fast.
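A minimal sketch of this bridge, assuming `Vec<u8>` payloads and leaving out the actual middleware calls:

```rust
use std::thread;
use tokio::sync::mpsc;

/// Spawns the tokio runtime on its own thread and returns the two channel
/// endpoints used by the synchronous (Python-facing) side.
fn spawn_runtime() -> (mpsc::Sender<Vec<u8>>, mpsc::Receiver<Vec<u8>>) {
    // Python -> runtime (outputs to publish) and runtime -> Python (inputs).
    let (output_tx, mut output_rx) = mpsc::channel::<Vec<u8>>(16);
    let (input_tx, input_rx) = mpsc::channel::<Vec<u8>>(16);

    thread::spawn(move || {
        let rt = tokio::runtime::Runtime::new().expect("failed to build tokio runtime");
        rt.block_on(async move {
            // The real implementation would also drive the middleware's async
            // input stream here; this loop only forwards published outputs.
            while let Some(data) = output_rx.recv().await {
                if input_tx.send(data).await.is_err() {
                    break;
                }
            }
        });
    });

    (output_tx, input_rx)
}
```

In this sketch, `send_output` would boil down to a `blocking_send` on the first handle and `next` to a `blocking_recv` on the second, so no async code crosses the language boundary.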
Looking at the Zenoh Python client, they rely heavily on `pyo3-asyncio` to pass
Rust futures into Python.
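For comparison, that pattern looks roughly like this (the function and its awaited body are placeholders, not actual dora or Zenoh code):

```rust
use pyo3::prelude::*;

/// Hypothetical async binding: wraps a Rust future into a Python awaitable
/// scheduled on the running asyncio event loop.
#[pyfunction]
fn next_async(py: Python<'_>) -> PyResult<&PyAny> {
    pyo3_asyncio::tokio::future_into_py(py, async {
        // The real middleware future would be awaited here.
        Ok(Python::with_gil(|py| py.None()))
    })
}
```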
This could be a solution as well, but from previous experiments I'm concerned about the performance of such a
solution: I have found that pushing Rust futures onto the `asyncio` queue can be slow.
I'm also concerned about mixing `async` and `sync` code in Python, as the sync code might block. This might require two thread pools in Python,
which could be a heavy overhead for some operations.