QuestDB 8.1.0 - Parquet, smarter snapshots, improved SAMPLE BY, and more

QuestDB is the world's fast growing time-series database. It offers premium ingestion throughput, enhanced SQL analytics that can power through analysis, and cost-saving hardware efficiency. It's open source and integrates with many tools and languages.

QuestDB 8.1.0 has arrived.

Our prior release introduced JSON extraction.

And now, by popular demand, QuestDB meets Apache Parquet.

In this release, QuestDB's foundation takes another step towards our next-generation architecture. This article will break down three key features, then itemize performance improvements and bugs.

For the full notes including PR links, checkout GitHub.

Parquet read support

Headlining this release is read support for Apache Parquet.

With Parquet support, you can have:

  • timeseries data within Questdb
  • master tables on Parquet files

And then join them with a single query.

To do so, use the read_parquet() function to read Apache Parquet files.

WITH parsed_data AS (
SELECT
timestamp,
json_extract(metadata, '$.location') AS location
FROM
read_parquet('financial_demo.parquet')
ORDER BY
timestamp
) timestamp(timestamp)
SELECT
timestamp,
location,
count()
FROM
parsed_data
SAMPLE BY
1d
FILL(null);

The query above:

  1. Reads a Parquet file named financial_demo.parquet
  2. Extracts the timestamp and location from the JSON metadata column
  3. Orders the data by timestamp
  4. Counts the occurrences of each location for each day
  5. Additionally, if there are any days with no data, those days will be included in the result set with null values

For more information, check out the documentation.

Both JSON support and Parquet reading are new as of the past two weeks. We plan on continuing this pace of innovation to improve and polish these in-demand features. We'll need your help! If it fits your use-case, we'd love to hear from you.

Please note that Parquet support is currently in beta.

Improved SAMPLE BY

Adds new FROM-TO syntax. This syntax is for specifying the output range for a query. In doing so, it enables "fill" data on either end of a result set. This also defines a lower bound for the data.

SAMPLE BY '5d' FROM '2018-01-01' TO '2019-01-01'

In simple terms, WHERE defines what data will be sampled, and FROM defines what shape the output data will have. The changes also adds more parallization to SAMPLE BY, with both FILL(VALUE) and FILL(NULL).

For more information, check out the thorough write-up in the documentation.

Snapshot recovery

Previously, snapshot recovery was driven by the Cairo Snapshot Instance ID property, which had no default value and was generally confusing to use. Snapshot recovery is now triggered by the existence of a trigger file in the QuestDB root directory.

Now, if a file named root/_restore exists and a snapshot is available, it will be restored on startup. The old strategy with Cairo Snapshot Instance IDs remains for backwards compatibility but is no longer recommended.

Additionally, a failure during snapshot recovery aborts QuestDB start. A failure during recovery means the system state is undefined and should not be relied on.

For more information, check out the documentation.

Performance 🚀

In addition, we've got the usual set of performance improvements.

  • Improved ILP ingress latency
  • SQL Performance:
    • Speedup for small transaction writing by 50-100%.
    • Enhanced performance for the like/ilike operator on symbol columns.
    • Speed improvements for regexp_replace(varchar) with simple patterns.
  • Aggregation queries: Aggregation queries are now faster.

Bug Fixes 🐛

And naturally, bugs too.

  • SQL:

    • Fixed spurious "unsupported operation" errors.
    • Corrected wrong results from parallel WHERE and GROUP BY for some function keys.
    • Fixed incorrect results from parallel GROUP BY with a single varchar function key.
    • Addressed issues with some window functions double-counting rows.
  • Core:

    • Prevented table suspension under memory pressure.
    • Fixed Charts when using together with OAuth.

Summary

Our past two releases came in quick succession. We're eager to get feedback and continue apace. If possible, please upgrade at your earliest convenience and let us know your thoughts. For questions and comments, swing by our Community Forum or our public Slack.

We'd love to hear from you.

RedditHackerNewsX
Subscribe to our newsletters for the latest. Secure and never shared or sold.