Link Search Menu Expand Document

rowcompress AM

Row-oriented batch-compressed storage with parallel scan, DELETE/UPDATE support, and multiple compression codecs.

Table of contents
  1. How It Works
    1. Comparison with colcompress
  2. Parallel Scan
  3. DELETE and UPDATE
  4. When to Use rowcompress
  5. Per-Table Options
  6. Example: Audit Log Table
  7. Compression Options

How It Works

rowcompress stores rows in fixed-size batches (default 10,000 rows per batch). Each batch is serialized using the PostgreSQL heap tuple format and compressed as a single unit. Batch metadata (file offset, byte size, first row number, row count) is stored in engine.row_batch.

Table file
├── Batch 1: [header | row offsets | compressed heap tuple data]
├── Batch 2: [header | row offsets | compressed heap tuple data]
└── …

This AM suits append-heavy workloads where compression matters but column projection is not needed — event logs, audit trails, time-series with many columns always queried together.

Comparison with colcompress

Feature rowcompress colcompress
Orientation Row-oriented Column-oriented
Column projection No (reads full rows) Yes (reads only referenced columns)
Vectorized execution No Yes
Chunk-level min/max pruning No Yes
Write latency per row Lower Higher (columnar transposition)
GROUP BY / analytics Slower Much faster
Sequential append + range by insert order Good Good
Scatter reads (random FK lookup) Very slow Full scan (flat latency)
Typical storage savings 2–10× 3–15×

Parallel Scan

rowcompress implements the PostgreSQL parallel scan protocol via atomic batch claiming. Each parallel worker atomically increments a shared counter to claim the next unprocessed batch, decompresses it, and repeats. There is no coordinator or work-distribution step — workers self-schedule in a work-stealing fashion with zero contention on most paths.

-- Standard PostgreSQL parallel knobs apply
SET max_parallel_workers_per_gather = 4;

DELETE and UPDATE

rowcompress supports DELETE and UPDATE via per-batch deleted-row bitmasks stored in engine.row_batch. Deleted rows are masked at read time without rewriting the batch. UPDATE is implemented as delete-then-insert.

DELETE FROM logs WHERE logged_at < now() - interval '90 days';
UPDATE logs SET level = 'WARN' WHERE level = 'WARNING';

Deleted rows are reclaimed during VACUUM (batch rewrite).


When to Use rowcompress

rowcompress is the right choice when:

  • Writes are frequent and low-latency matters — packing rows into batches is cheaper than columnar transposition
  • Queries always select most or all columns — no benefit from column projection
  • Access pattern is sequential — append order is the natural read order (e.g., process batch 1, then batch 2, etc.)
  • Table is effectively write-once — logs, events, immutable records

Avoid rowcompress when:

  • Queries filter by non-sequential keys (user_id, session_id, etc.) — scatter reads are catastrophic (K8: 1min17s for 20k rows vs colcompress 113ms)
  • Heavy GROUP BY / aggregation — no vectorized execution
  • Storage efficiency matters most — colcompress with zstd compresses better

Per-Table Options

SELECT engine.alter_rowcompress_table_set(
    'logs'::regclass,
    batch_size        => 10000,   -- rows per compressed batch (default: 10000)
    compression       => 'zstd',
    compression_level => 5
);

-- Reset to defaults
SELECT engine.alter_rowcompress_table_reset('logs'::regclass, compression => true);

-- Rewrite all batches with current options (e.g. after changing compression)
SELECT engine.rowcompress_repack('logs');

-- Inspect options
SELECT * FROM engine.rowcompress_options WHERE table_name = 'logs';

-- Inspect batches
SELECT * FROM engine.rowcompress_batches WHERE table_name = 'logs' LIMIT 10;

Example: Audit Log Table

CREATE TABLE audit_log (
    id         bigserial,
    logged_at  timestamptz NOT NULL DEFAULT now(),
    user_id    bigint,
    action     text NOT NULL,
    table_name text,
    old_data   jsonb,
    new_data   jsonb,
    ip_address inet
) USING rowcompress;

-- Tune for write-heavy audit workload
SELECT engine.alter_rowcompress_table_set(
    'audit_log'::regclass,
    batch_size        => 10000,
    compression       => 'zstd',
    compression_level => 3  -- low level = fast writes
);

-- Read recent logs (sequential scan, all batches)
SELECT action, COUNT(*)
FROM audit_log
WHERE logged_at > now() - interval '7 days'
GROUP BY action
ORDER BY COUNT(*) DESC;

-- Check storage efficiency
SELECT table_name, total_units, live_rows,
       pg_size_pretty(pg_total_relation_size('audit_log')) AS disk_size
FROM engine.storage_health
WHERE table_name = 'audit_log';

Compression Options

Codec Notes
pglz Always available
lz4 Fast, good for write-heavy. Requires liblz4-dev
zstd Best ratio. Requires libzstd-dev
deflate Middle ground. Requires libdeflate-dev
none No compression