When dropping the `DoraNode`, it waits for the remaining drop tokens. This only works if all the dora events were already dropped before. With the Python GC, this is not guaranteed as some events might still be live on the heap (the user might even use them later). In such cases, we waited until we ran into a timeout, which resulted in very long exit times (see https://github.com/dora-rs/dora/issues/598).
This commit fixes this issue by adding a reference-counted copy of the `DoraNode` and `EventStream` to every event given to Python. This way, we can ensure that the underlying `DoraNode` is only dropped after the last event reference has been freed.
Currently having a custom PyEvent make debugging very hard as fields are hidden within the class PyEvent that is defined within Rust Code.
Python user are getting really confused about this obscure class.
This PR transforms the class into a standard python dictionary.
- If input has no data, return an empty `NullArray` instead of an empty byte array.
- If input contains Vec data (instead of shared memory data), use safe `arrow::buffer::Buffer::from_vec` constructor, which is also zero-copy.
Buffer capacity is always 0 if the buffer is allocated externally. Also, we don't want to allocate unneccessary extra space if only parts of a buffer a used (e.g. when slicing).
Instead of doing an additional copy to send them from the operator thread to the runtime thread.
This commit uses the new `allocate_data_sample` and `send_output_sample` methods introduced in d7cd370.