Instead of doing an additional copy to send them from the operator thread to the runtime thread.
This commit uses the new `allocate_data_sample` and `send_output_sample` methods introduced in d7cd370.
We don't want to keep the input open until all drop tokens were released for two reasons:
- It adds an unnecessary delay. It is already clear that the output is finished, so by reporting it directly receivers can react earlier.
- Receivers might be blocked while waiting for new events, which prevents them from sending finished drop tokens. By closing the outputs, they will be unblocked through a new `InputClosed` event, which allows them to send their finished drop tokens. This way, we receive the remaining drop tokens faster in the sender.
The Python garbage collection will drop them non-deterministically in the background. Also, the dropping cannot happen while the GIL is held by our wait code.
There might still be some pending drop tokens after the receiving end of the event stream was closed. So we don't want to break from the receiver thread directly. Instead we keep it running until the control channel signals that it expects no more drop tokens by closing the `finished_drop_tokens` channel. This happens either when all required drop tokens were received, or because of a timeout.
Don't bind the lifetime of the event to the `next` call anymore. This makes it possible to use the original event in Python, which does not support borrowed data or lifetimes. The drawback is that we no longer have the guarantee that an event is freed before the next call to `recv`.
This commit migrates the allocation of shared memory samples from the daemon to the individual nodes. This way, the nodes can prepare the memory themselves when sending outputs. Thus, we avoid the extra roundtrip to the daemon that we used before (the prepare request and reply). The other advantage is that we can now queue shared memory outputs on the TCP socket without waiting for replies from the daemon (like we already do for `Vec`-backed messages).
The intent of this commit is to remove the quantity of log that is being pushed to user.
This commit removes the redeclaration of a set up tracing methods to centralise
the tokio-tracing subscriber within the extension crate. It also add the
feature to filter information based on Environment variable.
This makes it possible to define the log level for tokio tracing like
this:
```
RUST_LOG=debug dora-daemon --run-dataflow dataflow.yml
```
I have also unified the feature flag to make it easier to manage tracing features among the workspace.
I did not change the default behaviour of tracing in our crates and therefore by
using the command above you should get the same tracing log as before.
fix merge conflict generated
We currently need an additional request to prepare zero-copy shared memory messages. For small messages, it might be faster to avoid this extra round-trip and send the message directly over TCP. This commit implements support for this. Messages smaller than a threshold (currently set to 4096 bytes) are sent via TCP, while larger messages still use shared memory.
This step also enables future optimizations such as queueing output messages in order to improve the throughput.
Without joining the thread it is killed suddenly by the OS when the executable exits. This causes interrupted connections to the daemon, which leads to some errors in the log messages. This commit fixes this by joining the event stream thread again, with the following improvements:
- Instead of joining the thread directly on drop, we now do the joining in a second thread and report the join result over a channel. This allows us to use a timeout when waiting for the join result in the drop implementation, so that we don't block indefinitely.
- We now share the ownership of the join handle between the node and the event stream. This way, its drop handler is only run after both instances were dropped. This way we avoid the deadlock that happened when joining the event steam thread before the event stream instance was dropped.