Line Protocol
Line protocol is a text-based data format specifically designed for time-series data ingestion. It provides a compact, human-readable way to represent timestamped measurements with associated tags and fields using a standardized line-oriented syntax.
Understanding line protocol format
Line protocol follows a consistent structure that makes it ideal for time-series data:
measurement,tag_set field_set timestamp
Each line represents a single data point with:
- Measurement name (the metric being recorded)
- Tags (optional key-value pairs for categorization)
- Fields (the actual values being recorded)
- Timestamp (when the measurement occurred)
For example:
cpu,host=server1,region=us-west usage_idle=92.6,usage_user=7.4 1617897240000000000
Next generation time-series database
QuestDB is an open-source time-series database optimized for market and heavy industry data. Built from scratch in Java and C++, it offers high-throughput ingestion and fast SQL queries with time-series extensions.
Key components and syntax rules
Measurements
- Must escape spaces and commas with backslashes
- Cannot start with numbers
- Case-sensitive
Tags
- Optional but valuable for data partitioning
- Always string values
- Sorted alphabetically
- Key-value pairs separated by equals signs
- Multiple tags separated by commas
Fields
- Required (at least one)
- Support multiple data types (strings, floats, integers, booleans)
- Key-value pairs separated by equals signs
- Multiple fields separated by commas
Timestamps
- Optional (server timestamp used if omitted)
- Unix nanoseconds precision
- Must be at the end of the line
Next generation time-series database
QuestDB is an open-source time-series database optimized for market and heavy industry data. Built from scratch in Java and C++, it offers high-throughput ingestion and fast SQL queries with time-series extensions.
Benefits for time-series data ingestion
Performance advantages
- Minimal parsing overhead
- Efficient batch processing
- Reduced network bandwidth compared to JSON
- Optimized for write throughput
Operational benefits
- Human-readable format
- Easy debugging and manual inspection
- Simple to generate from various data sources
- Compatible with common Unix text processing tools
Common use cases
Metrics collection
system,dc=us-east memory_used=10240,memory_free=2048 1617897240000000000
IoT sensor data
sensor,device_id=12345,type=temperature value=22.5,battery=98 1617897240000000000
Financial market data
trade,symbol=AAPL,exchange=NASDAQ price=150.25,volume=1000 1617897240000000000
Best practices
-
Tag selection
- Use tags for commonly queried dimensions
- Avoid high-cardinality tag values
- Keep tag values concise
-
Field organization
- Group related measurements together
- Use consistent field names across measurements
- Choose appropriate numeric types
-
Timestamp precision
- Use consistent timestamp precision
- Consider application requirements
- Balance precision vs storage needs
-
Batch processing
- Group multiple lines for efficient ingestion
- Consider network and system limitations
- Implement appropriate error handling
Line protocol's simplicity and efficiency make it a popular choice for time-series data ingestion, particularly in scenarios requiring high-performance real-time data ingestion and processing.