The intent of this commit is to reduce the amount of log output that is pushed to the user.
This commit removes the redeclared tracing set-up methods and centralises
the tokio-tracing subscriber within the extension crate. It also adds the
ability to filter log output based on an environment variable.
This makes it possible to define the log level for tokio tracing like
this:
```
RUST_LOG=debug dora-daemon --run-dataflow dataflow.yml
```
I have also unified the feature flags to make it easier to manage tracing features across the workspace.
I did not change the default tracing behaviour of our crates, so by
using the command above you should get the same tracing log as before.
Fix generated merge conflict.
Previously, if one node failed we would immediately mark the dataflow as
finished even though other nodes might still be running.
This meant that if one node failed, it was not possible to stop the
dataflow through the CLI because it was already marked as finished.
This commit only warns the user that a node failed and keeps
waiting until all nodes within the dataflow have finished.
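The "warn, but keep waiting" behaviour can be illustrated with a minimal sketch. It uses plain threads as stand-ins for nodes; `NodeResult` and `wait_for_all_nodes` are hypothetical names chosen for this illustration and are not taken from the dora code base.

```rust
use std::thread::JoinHandle;

/// Hypothetical result of running a single node (an assumption, not a dora type).
type NodeResult = Result<(), String>;

/// Waits for every node to finish and returns the IDs of the nodes that failed.
/// A failing node only produces a warning; the remaining nodes are still awaited.
fn wait_for_all_nodes(nodes: Vec<(String, JoinHandle<NodeResult>)>) -> Vec<String> {
    let mut failed = Vec::new();
    for (id, handle) in nodes {
        match handle.join() {
            // Node finished successfully: nothing to report.
            Ok(Ok(())) => {}
            // Node returned an error: warn, remember it, keep waiting for the rest.
            Ok(Err(err)) => {
                eprintln!("warning: node `{id}` failed: {err}");
                failed.push(id);
            }
            // Node thread panicked: treated like a failure here.
            Err(_) => {
                eprintln!("warning: node `{id}` panicked");
                failed.push(id);
            }
        }
    }
    failed
}
```

Only once this function returns would the dataflow be marked as finished, so a stop request from the CLI can still be served while other nodes are running.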
Without this, the TCP connection uses Nagle's algorithm, which delays sending of packets until ACKs are received. This is problematic because most OSs deliberately delay ACKs to reduce overhead. For example, the default ACK delay on Linux is 40ms, which leads to 80ms latency when sending an output.
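As a rough illustration of the fix, the sketch below disables Nagle's algorithm by setting `TCP_NODELAY` on a freshly connected stream. It uses the standard library's blocking `TcpStream` for simplicity; tokio's `TcpStream` offers a `set_nodelay` method of the same name. The helper name `connect_without_nagle` is hypothetical.

```rust
use std::io;
use std::net::{TcpStream, ToSocketAddrs};

/// Connects to `addr` and disables Nagle's algorithm on the new stream,
/// so small writes are sent immediately instead of being held back.
/// (Hypothetical helper for illustration only.)
fn connect_without_nagle<A: ToSocketAddrs>(addr: A) -> io::Result<TcpStream> {
    let stream = TcpStream::connect(addr)?;
    // Sets the TCP_NODELAY socket option.
    stream.set_nodelay(true)?;
    Ok(stream)
}
```

The option has to be set on each side that sends small, latency-sensitive messages, since it only affects outgoing data on the socket it is applied to.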
Throw an error if a daemon cannot reach the coordinator anymore.
This is only an interim solution. In the future, we want to make the daemon more robust and ideally even allow restarts of the coordinator.
The previous connection might be closed already. It's difficult to check for that without sending any actual data, so we just permit re-registering of daemon connections. Previous connections are closed.
We forgot to handle closed control connections in the `handle_requests` loop. So instead of breaking out of that loop, we continued polling the connection for new messages again and again.
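The general pattern can be sketched as follows. This is not the actual `handle_requests` implementation; `handle_requests_sketch` is a hypothetical stand-in that works on any blocking `Read` source and treats a zero-byte read as a closed connection.

```rust
use std::io::{self, Read};

/// Reads from `connection` until the peer closes it.
/// Returns the number of chunks that were handled.
fn handle_requests_sketch<R: Read>(mut connection: R) -> io::Result<usize> {
    let mut buf = [0u8; 1024];
    let mut handled = 0;
    loop {
        match connection.read(&mut buf) {
            // A read of zero bytes means the peer closed the connection:
            // leave the loop instead of polling again and again.
            Ok(0) => break,
            // Placeholder for real request handling.
            Ok(_n) => handled += 1,
            // Retry if the read was merely interrupted.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(handled)
}
```

Without the `Ok(0) => break` arm, the loop would spin forever on a closed connection, which matches the behaviour described above.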
This commit is an initial draft of a fix for #147. The error occurs because
pyo3 links against the libpython found at compile time instead of
using the libpython that is available in `LD_LIBRARY_PATH`.
Currently, the only way to disable this linking is to use the `extension-module` flag.
This requires packaging the Python `runtime-node` as a Python library.
The Python `runtime-node` should also be fully compatible with the other operators in case we want a hybrid runtime.
The issue I'm facing is that, due to the packaging, I have to deal with the `GIL`, which is held from the start of the
`dora-runtime` node. This makes the process single-threaded, which is not an option.
So I have to disable it, but when I do, I get a race condition:
```
Exception ignored in: <module 'threading' from '/usr/lib/python3.8/threading.py'>
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 1373, in _shutdown
    assert tlock.locked()
AssertionError:
```
The issue is the same as https://github.com/PyO3/pyo3/issues/1274
To fix this, I'm going to look again at the different steps needed to make this work.
But this is the current state of my investigation.