Consistent histogram buckets

Create histograms with consistent bucket boundaries for distribution analysis. Different approaches suit different data characteristics.

Problem

A fixed bucket size works well for some data but poorly for others. For example, a bucket size of 0.5 produces a nice histogram for BTC trade amounts, but may produce just one or two buckets for assets with smaller typical values.

Solution 1: Fixed bucket size

When you know your data range, use a fixed bucket size:

Histogram with fixed 0.5 bucketsDemo this query
DECLARE @bucket_size := 0.5
SELECT
  floor(amount / @bucket_size) * @bucket_size AS bucket,
  count(*) AS count
FROM trades
WHERE symbol = 'BTC-USDT' AND timestamp IN '$today'
GROUP BY bucket
ORDER BY bucket;

How it works

floor(amount / 0.5) * 0.5

amount / 0.5: Divide by bucket width (1.3 → 2.6)
floor(...): Truncate to integer (2.6 → 2)
* 0.5: Multiply back (2 → 1.0)

Examples:

0.3 → floor(0.6) × 0.5 = 0.0
1.3 → floor(2.6) × 0.5 = 1.0
2.7 → floor(5.4) × 0.5 = 2.5

note

You must tune @bucket_size for your data range. A size that works for one symbol may not work for another.

Solution 2: Fixed bucket count (dynamic size)

To always get approximately N buckets regardless of the data range, calculate the bucket size dynamically:

Always ~50 bucketsDemo this query
DECLARE @bucket_count := 50

WITH raw_data AS (
  SELECT price, amount FROM trades
  WHERE symbol = 'BTC-USDT' AND timestamp IN '$today'
),
bucket_size AS (
  SELECT (max(price) - min(price)) / (@bucket_count - 1) AS bucket_size FROM raw_data
)
SELECT
  floor(price / bucket_size) * bucket_size AS price_bin,
  round(sum(amount), 2) AS volume
FROM raw_data CROSS JOIN bucket_size
GROUP BY 1
ORDER BY 1;

This calculates (max - min) / 49 to create 50 evenly distributed buckets. The CROSS JOIN makes the calculated bucket_size available to each row.

tip

If there are fewer distinct values than requested buckets, or if some buckets have no data, you'll get fewer than 50 results.

Solution 3: Logarithmic buckets

For data spanning multiple orders of magnitude:

Logarithmic buckets for wide value rangesDemo this query
SELECT
  power(10, floor(log(amount))) AS bucket,
  count(*) AS count
FROM trades
WHERE symbol = 'BTC-USDT'
  AND amount > 0.000001 -- optional. Just adding here for easier visualization
  AND timestamp IN '$today'
GROUP BY bucket
ORDER BY bucket;

Each bucket covers one order of magnitude (0.001-0.01, 0.01-0.1, 0.1-1.0, etc.).

Solution 4: Manual buckets

For simple categorical grouping:

Manual category bucketsDemo this query
SELECT
  CASE
    WHEN amount < 0.01 THEN 'micro'
    WHEN amount < 0.1 THEN 'small'
    WHEN amount < 1.0 THEN 'medium'
    ELSE 'large'
  END AS bucket,
  count(*) AS count
FROM trades
WHERE symbol = 'BTC-USDT' AND timestamp IN '$today'
GROUP BY bucket;

Time-series histogram

Track distribution changes over time by combining with SAMPLE BY:

Hourly histogram evolutionDemo this query
DECLARE @bucket_size := 0.5
SELECT
  timestamp,
  floor(amount / @bucket_size) * @bucket_size AS bucket,
  count(*) AS count
FROM trades
WHERE symbol = 'BTC-USDT' AND timestamp IN '$today'
SAMPLE BY 1h
ORDER BY timestamp, bucket;

Problem​

Solution 1: Fixed bucket size​

How it works​

Solution 2: Fixed bucket count (dynamic size)​

Solution 3: Logarithmic buckets​

Solution 4: Manual buckets​

Time-series histogram​