Don't get stale! Fine-tuning ASOF JOIN with TOLERANCE in QuestDB
ASOF JOIN in QuestDB is a powerhouse for joining time-series data, like trades to quotes. Traditionally, it matches a record from one table (e.g., a trade) with the latest record from another table (e.g., a quote) that occurred at or before the trade's timestamp. However, "latest" doesn't always mean "recent enough", potentially leading to joins with stale data.
QuestDB's new TOLERANCE
clause for ASOF JOIN
solves this. By specifying a
maximum time window (e.g., TOLERANCE 1s
), you can ensure matches are not just the
latest, but also relevantly recent.
This provides finer control, prevents misleading joins that return out-of-date information, and can even boost query performance.
An ASOF JOIN refresher
In the world of time-series data - whether it's financial market ticks, IoT sensor readings, or application logs - events are ordered by time, and this chronological order is what gives the data its meaning.
When you need to combine two such time-series datasets, for instance, matching
trades to the prevailing market quotes at the time of execution, you need a
specialized tool. In QuestDB, that tool is ASOF JOIN
.
ASOF JOIN
(short for "as of join") is designed specifically for these
scenarios. Its fundamental logic is: for each row in a "left" table, find the
row in a "right" table that has the latest timestamp less than or equal to the
left table's row's timestamp.
If multiple columns define the match (e.g., a stock symbol), ASOF JOIN
handles
that too, finding the latest match for each key.
Let's consider a classic market data example.
We have a trades
table and a quotes
table:
trades
table:
timestamp | symbol | price | quantity |
---|---|---|---|
2025-06-05T10:00:00.100Z | BTCUSD | 70000 | 1 |
2025-06-05T10:00:00.500Z | ETHUSD | 3800 | 10 |
2025-06-05T10:00:01.200Z | BTCUSD | 70001 | 2 |
quotes
table (NBBO - National Best Bid and Offer):
timestamp | symbol | bid_price | ask_price |
---|---|---|---|
2025-06-05T10:00:00.000Z | BTCUSD | 69999 | 70000 |
2025-06-05T10:00:00.050Z | ETHUSD | 3799 | 3800 |
2025-06-05T10:00:00.800Z | BTCUSD | 70000 | 70001 |
2025-06-05T10:00:00.900Z | ETHUSD | 3800 | 3801 |
A typical ASOF JOIN
to enrich trades with the quotes prevailing "as of" the
trade time would be:
SELECT t.timestamp AS trade_ts,t.symbol,t.price AS trade_price,q.timestamp AS quote_ts,q.bid_price,q.ask_priceFROM trades t ASOF JOIN quotes qON (symbol);
This query joins the two tables together based on their designated (ordering) timestamp, and partitions the join by symbol.
This query would correctly match:
- BTCUSD trade at
...100Z
with BTCUSD quote at...000Z
- ETHUSD trade at
...500Z
with ETHUSD quote at...050Z
- BTCUSD trade at
...01.200Z
with BTCUSD quote at...00.800Z
trade_ts | symbol | trade_price | quote_ts | bid_price | ask_price |
---|---|---|---|---|---|
2025-06-05T10:00:00.100Z | BTCUSD | 70000 | 2025-06-05T10:00:00.000Z | 69999 | 70000 |
2025-06-05T10:00:00.500Z | ETHUSD | 3800 | 2025-06-05T10:00:00.050Z | 3799 | 3800 |
2025-06-05T10:00:01.200Z | BTCUSD | 70001 | 2025-06-05T10:00:00.800Z | 70000 | 70001 |
This is powerful and intuitive for many time-series analyses.
The hidden risk: when "latest" isn't recent enough
The default ASOF JOIN
behavior is robust, but it has a subtlety: it will
always find the latest preceding or exact match, no matter how far back in
time that match might be. In fast-paced environments like financial markets, or
with intermittently reporting sensors, "latest" can sometimes mean "stale".
Imagine our BTCUSD symbol has a liquidity drought, or a data feed issue, and the
quotes
table has a large time gap:
quotes
table (with a gap for BTCUSD):
timestamp | symbol | bid_price | ask_price |
---|---|---|---|
2025-06-05T09:55:00.000Z | BTCUSD | 69990 | 69992 |
2025-06-05T10:00:00.050Z | ETHUSD | 3799 | 3800 |
... | ... | ... | ... |
2025-06-05T10:00:00.900Z | ETHUSD | 3800 | 3801 |
The last BTCUSD quote is 5 minutes old!
Now, if a BTCUSD trade occurs at 2025-06-05 10:00:00.100Z
, the standard
ASOF JOIN
would match it with the BTCUSD quote from 09:55:00.000Z
.
While technically the "latest available before the trade," a 5-minute-old quote in a volatile market is likely irrelevant and could lead to flawed analysis (e.g., incorrect Transaction Cost Analysis - TCA).
This is where the need for more fine-grained control arises. We need to say, "Find the latest quote, but only if it's no older than X."
Precision unleashed: the TOLERANCE clause
QuestDB now introduces the TOLERANCE
clause for ASOF JOIN
to address exactly
this challenge. TOLERANCE
allows you to specify a maximum acceptable time
difference between the timestamp in the left table (t1.ts
) and the matched
timestamp in the right table (t2.ts
).
The syntax is straightforward:
SELECT ...FROM table1 t1ASOF JOIN table2 t2ON (join_keys) TOLERANCE interval_literal;
Or without join keys:
SELECT...FROM table1 t1ASOF JOIN table2 t2 TOLERANCE interval_literal;
TIP
The TOLERANCE
clause is also supported for LT JOIN
using the same syntax.
It is not currently implemented for SPLICE JOIN
.
The core logic becomes this: a record from t1
at t1.ts
will be joined with a
record from t2
at t2.ts
if and only if:
t2.ts <= t1.ts
(standard ASOF condition:t2
is earlier or same time)t1.ts - t2.ts <= tolerance_value
(new condition: the age oft2
's record relative tot1
's is within the allowed tolerance)
The interval_literal
for TOLERANCE
uses familiar QuestDB time unit
qualifiers:
U
: microseconds (e.g.,'500U'
)T
: milliseconds (e.g.,'100T'
)s
: seconds (e.g.,'2s'
)m
: minutes (e.g.,'5m'
)h
: hours (e.g.,'1h'
)d
: days (e.g.,'1d'
)w
: weeks (e.g.,'1w'
)
TOLERANCE in the market: real-world examples
Let's revisit our market data scenario.
Scenario 1: ensuring fresh quotes for trade analysis
We want to join trades to quotes, but only if the quote is no older than 500 milliseconds.
SELECT t.timestamp AS trade_ts,t.symbol,t.price AS trade_price,q.timestamp AS quote_ts,q.bid_price,q.ask_priceFROM trades t ASOF JOIN quotes qON (symbol) TOLERANCE 500T -- Only use quotes within 500ms
Using our trades
table and the quotes
table with the gap:
- BTCUSD trade at
10:00:00.100Z
: The latest BTCUSD quote is at09:55:00.000Z
. The difference (5 minutes
) is far greater than'500T'
. This trade will not find a match for quote data (columnsquote_ts
,bid_price
,ask_price
would beNULL
). - ETHUSD trade at
10:00:00.500Z
: The latest ETHUSD quote is at10:00:00.050Z
. Difference is450ms
, which is<= 500ms
. This trade will be matched.
This highlights a key behavior: if no data is found within the tolerance, the
query acts like a LEFT OUTER JOIN
. The row from the left table (trades) is
always kept, and the columns from the right table (quotes) are filled with NULL
.
This prevents data from your primary table from being dropped just because a
timely match wasn't available.
This behavior is often exactly what's needed: either you get a relevant, recent match, or acknowledge that no such data exists within the acceptable window.
Scenario 2: handling illiquid assets or sparse data
For an asset that trades infrequently, or a sensor that reports sporadically,
quotes or readings might naturally be further apart. TOLERANCE
allows you to
define what "recent enough" means for that specific context.
Imagine analyzing option trades against the underlying stock's price. If the underlying stock is less liquid, you might need to set a wider tolerance:
SELECT ot.timestamp AS option_trade_ts,ot.option_symbol,s.timestamp AS stock_quote_ts,s.bid_price AS stock_bid,s.ask_price AS stock_askFROM option_trades ot ASOF JOIN stock_quotes sON (ot.underlying_symbol = s.symbol)TOLERANCE 5s -- Allow underlying quote to be up to 5 seconds oldWHERE ot.underlying_symbol = 'RARELYTRADEDSTOCK';
This gives you the flexibility to adapt to the data's characteristics while still bounding the staleness you're willing to accept.
Performance: a welcome side effect
Beyond analytical precision, TOLERANCE
can also bring performance benefits.
Internally, ASOF JOIN
often needs to scan backward in the right-hand table
(e.g., quotes
) to find the appropriate record for each row in the left-hand
table (e.g., trades
).
When TOLERANCE
is specified, QuestDB can terminate these backward scans much
earlier. Once it encounters a record in the right table that is older than the
left table's timestamp by more than the tolerance_value
, it knows no
subsequent (even older) record could possibly satisfy the tolerance condition.
This avoids unnecessary processing of data that's too old to be relevant, leading to faster query execution, especially on large datasets or when the tolerance significantly prunes the search space.
đź’ˇ Pro tip: TOLERANCE
is a new keyword!
The introduction of TOLERANCE
makes it a new reserved keyword specifically
within the JOIN
clause. If you previously used TOLERANCE
as an unquoted
table alias for the right-hand table in an ASOF JOIN
, your existing queries
might break.
Old, now problematic query:
-- If 'TOLERANCE' was an alias for 'quotes_archive'SELECT *FROM trades t ASOF JOIN quotes_archive TOLERANCE; -- Parser now expects an interval after TOLERANCE
The Fix: As per standard SQL, if you need to use a keyword as an identifier, enclose it in double quotes:
SELECT *FROM trades t ASOF JOIN quotes_archive "TOLERANCE"; -- "TOLERANCE" is now correctly seen as an alias
This is a small adjustment but crucial for a smooth transition.
Conclusion: ASOF JOIN with confidence and precision
The TOLERANCE
clause is a significant enhancement to QuestDB's ASOF JOIN
capabilities. It empowers users to:
- Increase analytical accuracy: by ensuring that joins only consider data within a relevant time window, preventing misleading conclusions drawn from stale information
- Gain fine-grained control: tailor the definition of "recency" to specific use cases and data characteristics
- Improve query performance: potentially speed up
ASOF JOIN
operations by reducing unnecessary data scanning
For anyone working with time-series data where the recency of joined information
is critical - especially in fields like finance, IoT analytics, and real-time
monitoring - TOLERANCE
provides a new level of precision and reliability.
Dive into the ASOF JOIN documentation to
explore more examples and start using TOLERANCE
in your QuestDB queries
today!