# dora-record

dora data recording using Apache Arrow Parquet.

This node is still experimental.

## Getting Started

```bash
cargo install dora-record --locked
```

## Adding to existing graph:

```yaml
- id: dora-record
  custom:
    source: dora-record
    inputs:
      image: webcam/image
      text: webcam/text
      # You can add any input and it is going to be logged.
```

## Output Files

Format: Parquet file

path: `out//.parquet`

Columns:

- trace_id: String, representing the id of the current trace
- span_id: String, representing the unique span id
- timestamp_uhlc: u64, representing the timestamp in [Unique Hybrid Logical Clock time](https://github.com/atolab/uhlc-rs)
- timestamp_utc: DataType::Timestamp(Milliseconds), representing the timestamp in Coordinated Universal Time.
- `` : Column containing the input in its defined format.

Example:

```json
{
  "trace_id": "2fd23ddf1b5d2aa38ddb86ceedb55928",
  "span_id": "15aef03e0f052bbf",
  "timestamp_uhlc": "7368873278370007008",
  "timestamp_utc": 1715699508406,
  "random": [1886295351360621740]
}
```

## Merging multiple files

We can merge input files using the `trace_id`, which is shared across inputs when the OpenTelemetry features are enabled.

- `trace_id` can also be queried from UIs such as Jaeger UI, InfluxDB, and so on.
- `trace_id` keeps track of the logical flow of data, whereas timestamp-based merging might not reflect the actual logical flow of data.
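
The merge on `trace_id` described above could be sketched as follows. This is a minimal illustration, not part of dora-record: the helper names `load_rows` and `merge_on_trace_id` are hypothetical, the sample rows are made up, and it assumes each recorded input ends up in its own Parquet file with a `trace_id` column. Reading the files relies on `pyarrow`, which is only imported when `load_rows` is called.

```python
from collections import defaultdict


def load_rows(path):
    """Read one recorded Parquet file into a list of row dicts (needs pyarrow)."""
    import pyarrow.parquet as pq  # lazy import: the merge logic below runs without it

    return pq.read_table(path).to_pylist()


def merge_on_trace_id(left_rows, right_rows, suffixes=("_left", "_right")):
    """Inner-join two lists of row dicts on their shared `trace_id`.

    Columns present on both sides (e.g. `span_id`) get a suffix so that
    neither value is overwritten.
    """
    # Index the right-hand rows by trace id for a single-pass join.
    right_by_trace = defaultdict(list)
    for row in right_rows:
        right_by_trace[row["trace_id"]].append(row)

    merged = []
    for left in left_rows:
        for right in right_by_trace.get(left["trace_id"], []):
            out = {"trace_id": left["trace_id"]}
            for key, value in left.items():
                if key == "trace_id":
                    continue
                name = (key + suffixes[0]) if key in right else key
                out[name] = value
            for key, value in right.items():
                if key == "trace_id":
                    continue
                name = (key + suffixes[1]) if key in left else key
                out[name] = value
            merged.append(out)
    return merged


# Made-up rows standing in for two recorded inputs, `image` and `text`.
image_rows = [
    {"trace_id": "a1", "span_id": "s1", "image": [1, 2, 3]},
    {"trace_id": "b2", "span_id": "s2", "image": [4, 5, 6]},
]
text_rows = [
    {"trace_id": "a1", "span_id": "s9", "text": "hello"},
]

merged = merge_on_trace_id(image_rows, text_rows)
```

Here only the rows sharing trace `a1` are joined; the `b2` image row has no matching text row and is dropped, which is the inner-join behaviour chosen for this sketch. With real recordings, the two lists would come from calling `load_rows` on the corresponding Parquet files.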