Market Data Time-Series Schema

SUMMARY

A market data time-series schema is the logical model used to store ticks, quotes, and order book events keyed by time for trading and analytics. Good schema design preserves market microstructure detail while keeping ingestion, queries, and regulatory replay fast and predictable. It sits between raw exchange feeds and trading, risk, and surveillance applications.

What Is a Market Data Time-Series Schema?

At its core, a market data time-series schema organizes events around a primary timestamp plus identifiers such as symbol, venue, and feed. Each “fact” table captures one microstructure concept: trades, top-of-book quotes, or depth-of-book updates. Columns then encode price, size, side, and other attributes.
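As a rough illustration of that layout, the sketch below models two fact tables as Python dataclasses. The field names (`ts_ns`, `feed`, `side`), the nanosecond timestamp unit, and the `key_of` helper are assumptions made for the example, not a prescribed layout.

```python
from dataclasses import dataclass, fields

# Hypothetical column names; a real schema would follow the feed's own spec.
KEY_COLUMNS = ("ts_ns", "symbol", "venue", "feed")


@dataclass(frozen=True)
class Trade:
    ts_ns: int      # primary event timestamp, nanoseconds since epoch (assumed unit)
    symbol: str
    venue: str
    feed: str
    price: float    # float for brevity; production schemas often prefer fixed-point
    size: int
    side: str       # aggressor side, e.g. "B" or "S"


@dataclass(frozen=True)
class Quote:
    ts_ns: int
    symbol: str
    venue: str
    feed: str
    bid_price: float
    bid_size: int
    ask_price: float
    ask_size: int


def key_of(row):
    """Return the shared (timestamp, symbol, venue, feed) key of any fact row."""
    return tuple(getattr(row, column) for column in KEY_COLUMNS)
```

Both tables lead with the same key columns, so rows from different fact tables can be correlated on time and instrument without table-specific logic.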

Unlike generic time-series metrics, schemas for tick data must handle bursty, irregular arrivals, strict event-time ordering, and lossless reconstruction requirements. This pushes architects toward append-only, time-partitioned tables and compact encodings of symbols and venues. The schema also needs stable keys so downstream systems can replay and correlate with order lifecycle and executions.
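One way to picture a compact encoding of symbols is dictionary encoding: each distinct string is stored once and rows carry a small integer code. The `SymbolDictionary` class below is a hypothetical, in-memory sketch of that idea, not how any particular database implements it.

```python
class SymbolDictionary:
    """Maps symbol strings to small integer codes (dictionary encoding)."""

    def __init__(self):
        self._code_by_symbol = {}
        self._symbol_by_code = []

    def encode(self, symbol):
        """Return the existing code for a symbol, or assign the next free one."""
        code = self._code_by_symbol.get(symbol)
        if code is None:
            code = len(self._symbol_by_code)
            self._code_by_symbol[symbol] = code
            self._symbol_by_code.append(symbol)
        return code

    def decode(self, code):
        """Return the symbol string for a previously assigned code."""
        return self._symbol_by_code[code]
```

Because codes are only ever appended and never reassigned, a code written today still decodes to the same symbol during a later replay, which is the kind of key stability downstream consumers rely on.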


Modeling Ticks, Quotes, and Order Books

A common pattern is three families of tables:

Trades: one row per executed trade, with event time, symbol, venue, trade_id, price, size, aggressor side, and flags. This aligns with tick data storage architecture.

Quotes: one row per best bid/ask change, with bid/ask prices and sizes. Top-of-book tables are narrower than depth tables and are queried heavily for quote-based benchmarks such as arrival price, and alongside the trades table for trade-based benchmarks such as VWAP.

Order book: either snapshots or delta events. Event-style schemas store side, level, price, size, and action (add/modify/delete), enabling full replay and microstructure analytics described in order book data storage.
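To make the event-style order book concrete, the sketch below replays a sequence of delta events into a book state. It is a simplification under stated assumptions: events are plain tuples of `(side, action, price, size)`, levels are keyed by price rather than by an explicit level index, and the function names `replay_book` and `best_bid_ask` are hypothetical.

```python
def replay_book(events):
    """Rebuild per-side price -> size maps from (side, action, price, size) events."""
    book = {"bid": {}, "ask": {}}
    for side, action, price, size in events:
        levels = book[side]
        if action in ("add", "modify"):
            levels[price] = size          # add a new level or overwrite its size
        elif action == "delete":
            levels.pop(price, None)       # ignore deletes for unknown levels
        else:
            raise ValueError(f"unknown action: {action}")
    return book


def best_bid_ask(book):
    """Return (best bid price, best ask price), with None for an empty side."""
    best_bid = max(book["bid"]) if book["bid"] else None
    best_ask = min(book["ask"]) if book["ask"] else None
    return best_bid, best_ask


events = [
    ("bid", "add", 100.0, 500),
    ("ask", "add", 100.5, 300),
    ("bid", "add", 99.5, 200),
    ("bid", "modify", 100.0, 400),
    ("ask", "delete", 100.5, 0),
    ("ask", "add", 101.0, 100),
]
book = replay_book(events)
```

Replaying the same events in the same order always yields the same book, which is why delta-style tables need a strict event-time ordering to support full reconstruction.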

Venue-specific nuances (e.g., auction states, trading status) usually live in adjacent state tables keyed by symbol and time.

Performance, Risk, and Evolution

Performance begins with partitioning on time and indexing or clustering by symbol and venue, as in a dedicated market data time-series database. This makes intraday symbol-range scans efficient and keeps regulatory trade reconstruction feasible even at multi-venue scale.
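The effect of time partitioning plus a per-symbol index can be sketched in plain Python. The example below is a toy in-memory model, assuming UTC day partitions, nanosecond timestamps, and in-order appends per symbol; `PartitionedTrades` and `day_key` are hypothetical names, and a real database would do this on disk with its own storage format.

```python
from bisect import bisect_left
from collections import defaultdict
from datetime import datetime, timezone

NS_PER_SECOND = 1_000_000_000


def day_key(ts_ns):
    """Map a nanosecond epoch timestamp to its UTC day partition name."""
    seconds = ts_ns // NS_PER_SECOND
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%d")


class PartitionedTrades:
    def __init__(self):
        # day partition -> symbol -> (sorted timestamps, rows in the same order)
        self._parts = defaultdict(lambda: defaultdict(lambda: ([], [])))

    def append(self, ts_ns, symbol, row):
        """Append a row; this sketch rejects out-of-order timestamps per symbol."""
        ts_list, rows = self._parts[day_key(ts_ns)][symbol]
        if ts_list and ts_ns < ts_list[-1]:
            raise ValueError("out-of-order append")
        ts_list.append(ts_ns)
        rows.append(row)

    def scan(self, day, symbol, start_ns, end_ns):
        """Return rows for one symbol with start_ns <= ts < end_ns in one day."""
        part = self._parts.get(day)
        if part is None or symbol not in part:
            return []
        ts_list, rows = part[symbol]
        lo = bisect_left(ts_list, start_ns)
        hi = bisect_left(ts_list, end_ns)
        return rows[lo:hi]
```

A scan touches only one partition and one symbol's timestamps, then binary-searches the time range, so its cost depends on the rows returned rather than on the total volume across days and symbols.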

Risk and surveillance teams depend on schema stability, so evolution must be additive: new columns for extra flags or derived metrics, while preserving existing layouts. Concepts from schema evolution and temporal data modeling apply directly when adding venues, asset classes, or depth representations without breaking legacy consumers.
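A simple guard for additive evolution is to check that a proposed schema keeps every existing column, with the same type and position, and only appends new ones. The sketch below assumes schemas are represented as ordered `name -> type` dictionaries; `is_additive_change` and the column and type names are hypothetical.

```python
def is_additive_change(old_schema, new_schema):
    """True if every old column survives with the same name, type, and position."""
    old_columns = list(old_schema.items())
    new_columns = list(new_schema.items())
    return new_columns[:len(old_columns)] == old_columns


trades_v1 = {"ts": "timestamp", "symbol": "symbol", "price": "double"}
trades_v2 = {**trades_v1, "condition_flags": "int"}        # appends one column
trades_bad = {"ts": "timestamp", "symbol": "symbol", "price": "float"}

additive_ok = is_additive_change(trades_v1, trades_v2)      # True
additive_bad = is_additive_change(trades_v1, trades_bad)    # False: type changed
```

Running a check like this before applying a migration helps catch renames, retypes, and reorderings that would break legacy consumers reading the old layout.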

For hands-on design patterns using real feeds, see "Ingesting financial tick data using a time-series database" and "Ingesting L2 order-book data with multidimensional arrays".
