Streaming market data from Arroyo into QuestDB
Introduction
Arroyo is a new stream processing engine that’s gained a lot of attention since its release — and especially after its recent acquisition by Cloudflare. Designed for low-latency, SQL-first stream processing, Arroyo is written in Rust and makes it easy to build streaming data pipelines without the complexity of alternative systems like Apache Flink or Spark.
It’s fast, lightweight, and expressive. And more importantly for us: it speaks SQL.
QuestDB is a high-performance time-series database built for SQL. If you’re using Arroyo for in-stream processing — such as enrichment, validation, or transformation — and need a sink that can power real-time analytics, QuestDB is a natural fit.
The problem: no native QuestDB connector
When I first looked into connecting Arroyo and QuestDB, I considered a few approaches:
Option 1: Kafka + Kafka Connect
Arroyo already supports Kafka sinks, and we have a Kafka Connect connector for QuestDB. Here’s how it might look:
```json
{
  "name": "questdb-trades",
  "connector.class": "io.questdb.kafka.QuestDBSinkConnector",
  "tasks.max": "5",
  "topics": "trades",
  "client.conf.string": "http::addr=http://localhost:9000;",
  "value.converter": "org.apache.kafka.connect.json.JsonConverter",
  "value.converter.schemas.enable": true,
  "include.key": false,
  "key.converter": "org.apache.kafka.connect.storage.StringConverter",
  "table": "trades",
  "symbols": "symbol, side",
  "timestamp.field.name": "ts"
}
```
This works, especially if you're already using Kafka. But it adds overhead — an extra system to manage, with added CPU and memory usage, increased latency, and a need to debug Kafka Connect itself if anything goes wrong. On top of that, it uses JSON encoding, which is not ideal for throughput.
Option 2: Postgres sink
Arroyo offers PostgreSQL compatibility through Debezium, which also requires Kafka. And again, the format is JSON — which is bulky, and the pipeline is still more complex than it needs to be.
The idea: use Arroyo's webhook + QuestDB’s ILP
Both Arroyo and QuestDB speak SQL. And QuestDB exposes an HTTP endpoint (`/write`) that accepts InfluxDB Line Protocol (ILP).
So I had this idea: could I use Arroyo’s webhook connector to send ILP-formatted data directly to QuestDB?
Turns out: yes! I later found a video from Micah Wylde (creator of Arroyo) where he used this exact approach to send data into InfluxDB. That confirmed my hunch — this could work with QuestDB too.
How ILP works
Here's an example of a line in ILP:
```
trades,symbol=BTC-USD,side=buy price=39269.98,amount=0.001 1646762637710419000
```
It consists of:
- Table name: `trades`
- Tags (symbols/strings): `symbol=BTC-USD,side=buy`
- Fields: `price=...`, `amount=...`
- Timestamp (optional, in nanoseconds)
You can send many lines in a single HTTP POST to `/write`, each separated by `\n`.
Full working example
Let’s walk through a working example using Arroyo’s `impulse` connector to generate data, and its webhook connector to send data to QuestDB.
1. Create the table in QuestDB
```sql
CREATE TABLE trades (
    symbol SYMBOL,
    side SYMBOL,
    price DOUBLE,
    amount DOUBLE,
    ts TIMESTAMP
) TIMESTAMP(ts) PARTITION BY DAY;
```
2. Create the Arroyo webhook sink
```sql
CREATE TABLE questdb_sink (
    value TEXT
) WITH (
    connector = 'webhook',
    endpoint = 'http://localhost:9000/write',
    format = 'raw_string'
);
```
3. Generate data with Arroyo’s impulse connector
```sql
CREATE TABLE impulse WITH (
    connector = 'impulse',
    event_rate = '1'
);
```
4. Insert formatted ILP lines into QuestDB
```sql
INSERT INTO questdb_sink
SELECT
    ARRAY_TO_STRING(
        ARRAY_AGG(
            CONCAT(
                'trades,symbol=BTC-USD,side=buy ',
                'price=', RANDOM() * 30000 + 20000, ',',
                'amount=', RANDOM() * 0.01, ' ',
                CAST(to_timestamp_nanos(NOW()) AS BIGINT)
            )
        ),
        CHR(10)
    ) AS value
FROM impulse
GROUP BY TUMBLE(INTERVAL '5 SECONDS');
```
With an event rate of one per second and a 5-second tumbling window, this sends one HTTP POST every 5 seconds, each carrying a batch of 5 ILP lines into QuestDB — all done with SQL.
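To confirm rows are landing, you can query QuestDB’s REST `/exec` endpoint, which executes a SQL statement passed in the `query` parameter and returns JSON. A minimal sketch (the `exec_url` helper is mine, and the fetch assumes QuestDB on `localhost:9000`):

```python
# Sketch: build a URL for QuestDB's /exec REST endpoint, which runs the SQL
# passed in the "query" parameter and returns a JSON result set.
import urllib.parse

def exec_url(base, sql):
    # urlencode percent-escapes the SQL so it is safe in a query string
    return f"{base}/exec?" + urllib.parse.urlencode({"query": sql})

url = exec_url("http://localhost:9000", "SELECT count() FROM trades")
print(url)

# Fetching it (requires a running QuestDB):
# import json, urllib.request
# result = json.load(urllib.request.urlopen(url))
```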
Conclusion
Arroyo and QuestDB pair naturally — both are fast, SQL-first, and easy to integrate.
While there’s no native QuestDB sink in Arroyo today, the webhook connector + ILP over HTTP makes for a clean, dependency-free pipeline. No Kafka, no JSON, no UDFs.
If you're experimenting with Arroyo and want fast, scalable storage for your real-time streams, give this a try — and let us know how it goes!
Explore other integrations of QuestDB with third party tools or learn more about QuestDB at our documentation.