dora-rs/dora - dora - Open Source Collaborative Cloud Brain Ecosystem Support System

Commit Graph

Author	SHA1	Message	Date
Philipp Oppermann	4da34d82d9	Merge branch 'main' into event-timestamps	2 years ago
Philipp Oppermann	984ee364e4	Fill in timestamps when sending events from node to daemon	2 years ago
Philipp Oppermann	e0559b031e	Add HLC clock to daemon and to all event types	2 years ago
Philipp Oppermann	482b282d92	Document `dataflow_descriptor` methods	2 years ago
Philipp Oppermann	0bf31b750e	WIP: Add `uhlc` timestamps to all events and update HLC clocks	2 years ago
Philipp Oppermann	8a439c3411	Pass dataflow descriptor to Rust nodes via API	2 years ago
haixuanTao	51b35aa463	Fix doc test by adding imports	3 years ago
haixuanTao	f9d8fac8e5	Adding Rust docstring documentation	3 years ago
Philipp Oppermann	132957ad34	Fix: Don't ignore subscribe results	3 years ago
Philipp Oppermann	8ec45239ce	Send empty map empty arrays to `None` when sending outputs	3 years ago
Philipp Oppermann	1948a45e6d	Copy outputs directly into shared memory in dora runtime, instead of doing an additional copy to send them from the operator thread to the runtime thread. This commit uses the new `allocate_data_sample` and `send_output_sample` methods introduced in `d7cd370`.	3 years ago
Philipp Oppermann	d7cd3708a7	Add `allocate_data_sample` and `send_output_sample` methods to Rust node API. This provides an alternative way to send outputs with fewer lifetime restrictions than the closure-based interface.	3 years ago
Philipp Oppermann	c1544f8257	Allow accessing data multiple times on Python event type	3 years ago
Philipp Oppermann	a308c48f3e	Refactor Rust node API to ensure proper stopping	3 years ago
Philipp Oppermann	abd850c81e	Send drop tokens over separate channel	3 years ago
Philipp Oppermann	ef37caa3ed	Add arrow support for Python operator inputs. Reuse same event type as for Python node API.	3 years ago
Philipp Oppermann	e413d38a8a	Validate `arrow::ArrayData` creation	3 years ago
Philipp Oppermann	6688ae3072	Close outputs directly on drop to notify subscribers immediately. We don't want to keep the input open until all drop tokens were released, for two reasons: (1) It adds an unnecessary delay. It is already clear that the output is finished, so by reporting it directly receivers can react earlier. (2) Receivers might be blocked while waiting for new events, which prevents them from sending finished drop tokens. By closing the outputs, they will be unblocked through a new `InputClosed` event, which allows them to send their finished drop tokens. This way, we receive the remaining drop tokens faster in the sender.	3 years ago
Philipp Oppermann	a19a746d5a	Report remaining drop tokens before event stream thread finishes	3 years ago
Philipp Oppermann	59f4dde8e4	Improve tracing output	3 years ago
Philipp Oppermann	bb822a5ff4	Move pending drop token handling to separate function	3 years ago
Philipp Oppermann	0ed4dcf0f6	Handle drop tokens asynchronously instead of waiting for them. The Python garbage collection will drop them non-deterministically in the background. Also, the dropping cannot happen while the GIL is held by our wait code.	3 years ago
Philipp Oppermann	446fa7f0dc	Wait for remaining drop tokens before breaking from event stream thread. There might still be some pending drop tokens after the receiving end of the event stream was closed. So we don't want to break from the receiver thread directly. Instead we keep it running until the control channel signals that it expects no more drop tokens by closing the `finished_drop_tokens` channel. This happens either when all required drop tokens were received, or because of a timeout.	3 years ago
Philipp Oppermann	a661030c56	Break from drop token wait loop on timeout	3 years ago
Philipp Oppermann	453e03c0e9	Warn if shared memory regions are closed before receiving drop tokens	3 years ago
Philipp Oppermann	9030e729fe	Extract event stream loop into separate function	3 years ago
Philipp Oppermann	983a26f664	Box shared memory instance to reduce size of `Event` enum	3 years ago
Philipp Oppermann	0d0d6630a1	Add TODO	3 years ago
Philipp Oppermann	1fab082717	Use `arrow` crate to send events from Rust to Python without copying. See #224 for details.	3 years ago
Philipp Oppermann	2ca97fdc6b	Remove lifetime from `Event` type. Don't bind the lifetime of the event to the `next` call anymore. This makes it possible to use the original event in Python, which does not support borrowed data or lifetimes. The drawback is that we no longer have the guarantee that an event is freed before the next call to `recv`.	3 years ago
Philipp Oppermann	06e020472b	Check that dora daemon and node API versions match	3 years ago
haixuanTao	8f05ec79a1	Add attach logic and hot-reloading logic to the cli	3 years ago
haixuanTao	2d76cd5cc8	Add `Reload` event containing `dataflow_id, node_id, op_id` that needs to be reloaded	3 years ago
haixuanTao	c565071140	Change `instrument` level to not pollute user logs	3 years ago
haixuanTao	1e965452d5	Replace opentelemetry with tokio::tracing	3 years ago
haixuanTao	c4794725aa	Fix API description to use workspace test new release for github CI/CD	3 years ago
haixuanTao	38e02df62c	Update rust api to publish it to crates.io	3 years ago
Philipp Oppermann	f2fb2ffab6	Merge pull request #210 from dora-rs/remove-workspace-path. Clean up: Remove workspace path	3 years ago
haixuanTao	f1cf0864a4	Remove static path from workspace dependency. Removing those paths makes it easier to move packages, reduces path-management complexity, and makes it easier to export crates to `crates.io`.	3 years ago
Philipp Oppermann	b4ebbff074	Allocate shared memory in nodes to improve throughput. This commit migrates the allocation of shared memory samples from the daemon to the individual nodes. This way, the nodes can prepare the memory themselves when sending outputs. Thus, we avoid the extra roundtrip to the daemon that we used before (the prepare request and reply). The other advantage is that we can now queue shared memory outputs on the TCP socket without waiting for replies from the daemon (like we already do for `Vec`-backed messages).	3 years ago
haixuanTao	70e6d4ce8f	Remove duplicated code for the tracing subscriber and use an environment variable to manage the log level. The intent of this commit is to reduce the amount of log output that is pushed to the user. This commit removes the redeclared tracing setup methods in order to centralise the tokio-tracing subscriber within the extension crate. It also adds the ability to filter information based on an environment variable. This makes it possible to define the log level for tokio tracing like this: `RUST_LOG=debug dora-daemon --run-dataflow dataflow.yml`. I have also unified the feature flag to make it easier to manage tracing features across the workspace. I did not change the default behaviour of tracing in our crates, so the command above should produce the same tracing log as before. Fix generated merge conflict.	3 years ago
Philipp Oppermann	feca83e309	Refactor: Split Rust node API into smaller submodules	3 years ago
Philipp Oppermann	c21b70ae65	Rename submodule	3 years ago
Philipp Oppermann	11e970e268	Refactor: Move `DoraNode` implementation into submodule	3 years ago
Philipp Oppermann	f4f20f084a	Merge pull request #195 from dora-rs/dont-ack-send-message. Don't send replies for `SendMessage` requests when using TCP	3 years ago
Philipp Oppermann	7974d1fe9f	Improve warning message when input is not dropped in time	3 years ago
Philipp Oppermann	9bb57ee531	Don't send replies for `SendMessage` requests when using TCP. Allows queueing of multiple messages on the TCP socket, which considerably improves throughput.	3 years ago
Philipp Oppermann	2ccc170649	Send all queued incoming events at once on `NextEvent` request. Sends a list of all queued events in the reply instead of sending them one-by-one. This way, we avoid some communication overhead.	3 years ago
Philipp Oppermann	00421e2bcd	Send small messages directly without shared memory. We currently need an additional request to prepare zero-copy shared memory messages. For small messages, it might be faster to avoid this extra round-trip and send the message directly over TCP. This commit implements support for this. Messages smaller than a threshold (currently set to 4096 bytes) are sent via TCP, while larger messages still use shared memory. This step also enables future optimizations such as queueing output messages in order to improve the throughput.	3 years ago
Philipp Oppermann	085a0723db	Re-add joining of event stream thread, now based on shared ownership and using a timeout. Without joining the thread, it is killed suddenly by the OS when the executable exits. This causes interrupted connections to the daemon, which leads to some errors in the log messages. This commit fixes this by joining the event stream thread again, with the following improvements: (1) Instead of joining the thread directly on drop, we now do the joining in a second thread and report the join result over a channel. This allows us to use a timeout when waiting for the join result in the drop implementation, so that we don't block indefinitely. (2) We now share the ownership of the join handle between the node and the event stream. This way, its drop handler is only run after both instances were dropped, which avoids the deadlock that happened when joining the event stream thread before the event stream instance was dropped.	3 years ago
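
The size-based routing described in commit 00421e2bcd can be sketched roughly as follows. This is only an illustration of the rule stated in the commit message (messages smaller than 4096 bytes go over TCP, larger ones use shared memory). The names `ZERO_COPY_THRESHOLD`, `Transport`, and `choose_transport` are hypothetical and are not taken from the dora code base.

```rust
/// Threshold from the commit message (4096 bytes). The constant name is hypothetical.
const ZERO_COPY_THRESHOLD: usize = 4096;

/// Hypothetical transport choice; not dora's actual type.
#[derive(Debug, PartialEq, Eq)]
enum Transport {
    /// Payload is sent inline over the TCP connection.
    Tcp,
    /// Payload is written to a shared memory region.
    SharedMemory,
}

/// Messages smaller than the threshold go over TCP; all others use shared memory.
fn choose_transport(payload_len: usize) -> Transport {
    if payload_len < ZERO_COPY_THRESHOLD {
        Transport::Tcp
    } else {
        Transport::SharedMemory
    }
}

fn main() {
    // Print the chosen transport for a few payload sizes around the threshold.
    for len in [16, 4095, 4096, 1 << 20] {
        println!("{len} bytes -> {:?}", choose_transport(len));
    }
}
```

Whether a payload of exactly 4096 bytes counts as small is an assumption here, based on the wording "smaller than a threshold" in the commit message.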

184 Commits (bd9706b9589cc4e1046b9b1ae35661fac853d2de)