Handling Large Result Sets

When querying large datasets over PGWire, many PostgreSQL drivers load the entire result set into memory before returning any rows. This can cause:

  • Out-of-memory errors on the client
  • Long waits before the first row is available
  • Timeouts on very large queries

The solution is to use cursor-based fetching (also called streaming or paging), which retrieves rows in batches.

The problem

By default, most PostgreSQL drivers execute queries like this:

  1. Send query to server
  2. Wait for all rows to be returned
  3. Load everything into client memory
  4. Return results to application

For a query returning 100 million rows, this means loading all 100 million rows into RAM before your application sees a single row.
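
For example, with the PostgreSQL JDBC driver the default pattern looks roughly like the sketch below (class name, credentials, and table name are illustrative); the first call to rs.next() returns only after the entire result set has been transferred and buffered on the client:

import java.sql.*;

public class DefaultFetchExample {
    public static void main(String[] args) throws SQLException {
        String url = "jdbc:postgresql://localhost:8812/qdb";

        // Auto-commit is on and no fetch size is set, so the driver
        // buffers the whole result set in client memory
        try (Connection conn = DriverManager.getConnection(url, "admin", "quest");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM large_table")) {
            while (rs.next()) {
                // By the time this loop starts, every row is already in RAM
            }
        }
    }
}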

The solution

Configure your driver to use cursor-based fetching:

  1. Send query to server
  2. Fetch rows in batches (e.g., 10,000 at a time)
  3. Process each batch before fetching the next
  4. Repeat until all rows are consumed

This keeps memory usage constant regardless of result set size.

Language examples

Java (JDBC)

JDBC loads all rows into memory by default. To enable cursor-based fetching:

import java.sql.*;
import java.util.Properties;

public class LargeResultSet {
    public static void main(String[] args) throws SQLException {
        String url = "jdbc:postgresql://localhost:8812/qdb";
        Properties props = new Properties();
        props.setProperty("user", "admin");
        props.setProperty("password", "quest");

        try (Connection conn = DriverManager.getConnection(url, props)) {
            // REQUIRED: Disable auto-commit to enable cursor mode
            conn.setAutoCommit(false);

            try (Statement stmt = conn.createStatement()) {
                // Set fetch size - rows fetched per batch
                stmt.setFetchSize(10000);

                try (ResultSet rs = stmt.executeQuery(
                        "SELECT * FROM large_table")) {
                    while (rs.next()) {
                        // Process each row
                        // Memory usage stays constant
                    }
                }
            }
        }
    }
}

Key points:

  • setAutoCommit(false) is required — cursor mode only works within a transaction
  • setFetchSize(n) controls how many rows are fetched per round-trip
  • Choose a fetch size based on row size and available memory (1,000–50,000 is typical)

Choosing a fetch size

The optimal fetch size depends on:

  Factor                               Recommendation
  Row size: small (< 100 bytes)        10,000–50,000 rows
  Row size: medium (100–1,000 bytes)   1,000–10,000 rows
  Row size: large (> 1 KB)             100–1,000 rows
  High-latency network                 Larger batches (reduce round-trips)
  Memory-constrained client            Smaller batches

Start with 10,000 and adjust based on memory usage and performance.
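
If you want to make that choice explicit, a small helper along the lines of the sketch below can encode the table; the class name, method name, and thresholds here are hypothetical and simply mirror the ranges above:

public final class FetchSizes {
    // Hypothetical helper: map an estimated average row width in bytes
    // to a fetch size inside the ranges from the table above
    public static int forAvgRowBytes(int avgRowBytes) {
        if (avgRowBytes < 100) {
            return 20_000;  // small rows: larger batches are cheap
        }
        if (avgRowBytes <= 1_000) {
            return 10_000;  // medium rows: the usual starting point
        }
        return 1_000;       // large rows: keep per-batch memory bounded
    }
}

// e.g. stmt.setFetchSize(FetchSizes.forAvgRowBytes(estimatedRowBytes));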

Common mistakes

Collecting all rows into a list

// BAD: Collects all rows into memory
List<Row> allRows = new ArrayList<>();
while (rs.next()) {
    allRows.add(extractRow(rs));
}
// allRows now contains millions of rows in memory

// GOOD: Process rows as they arrive
while (rs.next()) {
    processRow(rs); // Stream to file, aggregate, etc.
}
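
For instance, "process" could mean appending each row to a CSV file so nothing ever accumulates on the heap. A minimal sketch, assuming a two-column result set and a caller that supplies an output path (the class, method, and column types are illustrative):

import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.ResultSet;
import java.sql.SQLException;

public class StreamToCsv {
    // Illustrative processRow-style helper: each row is written to disk
    // as soon as it arrives, so memory stays bounded by a single row
    // plus the writer's buffer
    static void writeRows(ResultSet rs, String path) throws SQLException, IOException {
        try (BufferedWriter out = Files.newBufferedWriter(Paths.get(path))) {
            while (rs.next()) {
                out.write(rs.getString(1) + "," + rs.getLong(2));
                out.newLine();
            }
        }
    }
}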

Forgetting to disable auto-commit (Java)

// BAD: Cursor mode won't work
stmt.setFetchSize(10000); // Ignored without setAutoCommit(false)

// GOOD: Disable auto-commit first
conn.setAutoCommit(false);
stmt.setFetchSize(10000);

Using unnamed cursors (Python)

# BAD: Loads all rows into memory
cursor = conn.cursor()
cursor.execute("SELECT * FROM large_table")
rows = cursor.fetchall()  # OOM for large tables

# GOOD: Use a named cursor
cursor = conn.cursor(name='streaming_cursor')
cursor.execute("SELECT * FROM large_table")
for row in cursor:  # Streams in batches
    process(row)
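
If the snippet above uses psycopg2 (whose API the named-cursor call matches), the number of rows fetched per round-trip while iterating is controlled by cursor.itersize, which defaults to 2,000 and plays the same role as the JDBC fetch size.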

See also