Name

viprs — generate tile pyramids from images and PDFs for web-based viewers

Synopsis

viprs pyramid [OPTIONS] <INPUT> [<OUTPUT>]
viprs info <INPUT>
viprs plan [OPTIONS] <WIDTH_OR_INPUT>
viprs test-image [OPTIONS] <OUTPUT>

# Phase 3 hardening flags (subset)
viprs pyramid input.pdf tiles/ --blank-tolerance 2
viprs pyramid input.pdf --sink s3://bucket/prefix
viprs pyramid input.pdf out/ --resume --retry-max 3 --retry-backoff 200
viprs pyramid input.pdf out/ --manifest-emit-checksums --checksum-algo blake3
viprs pyramid input.pdf out/ --dedupe-blanks
viprs pyramid input.pdf --sink packfile://out.tar.gz
viprs pyramid input.pdf out/ --trace-level info

Description

viprs converts large raster images and PDF documents into multi-resolution tile pyramids suitable for Deep Zoom viewers, slippy-map UIs, and GIS applications. It supports three tile layouts (DeepZoom, XYZ, Google Maps), three image formats (PNG, JPEG, raw), and three engine modes (monolithic, streaming, parallel MapReduce) to balance throughput and memory usage for images of any size.

Commands

pyramid

Generate a tile pyramid from a PDF or image file. This is the primary command — it loads the source, plans the pyramid grid, and writes tiles to disk.

Reference test: builder_sink_fs.rs · two_arg_new_defaults_to_png — the canonical minimal builder call.

Arguments

<INPUT>

Path to a PDF, PNG, JPEG, or TIFF file. Use "-" for stdin (pipe a file directly into viprs). PDF inputs are either raster-extracted (default) or vector-rendered (with --render).

<OUTPUT>

Directory where tiles are written. Created automatically. DeepZoom also writes a .dzi manifest alongside.

Tile options

--tile-size <PIXELS>

Width and height of each tile in pixels. Viewers fetch tiles on demand — smaller tiles mean finer-grained loading but more HTTP requests. 256 is the web standard; use 512 for high-DPI displays or when serving from fast storage. Default: 256

--overlap <PIXELS>

Pixel overlap between adjacent tiles (DeepZoom only). Overlap lets viewers blend tile edges for seamless rendering. OpenSeadragon typically uses overlap=1. Not applicable to Google or XYZ layouts. Default: 0
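The effect of --overlap on tile extents can be sketched as follows. This assumes the usual Deep Zoom convention (overlap is added on each side that has a neighbour and clipped at the image edge); the function name tile_span is hypothetical and viprs itself may compute bounds differently.

```rust
/// Pixel span [start, end) covered by tile `index` along one axis.
/// Assumption: standard Deep Zoom behaviour, not taken from viprs source.
fn tile_span(index: u32, tile_size: u32, overlap: u32, image_len: u32) -> (u32, u32) {
    let nominal_start = index * tile_size;
    // The first tile has no neighbour on its leading edge, so it gains no overlap there.
    let start = nominal_start.saturating_sub(overlap);
    // Trailing overlap is clipped so the span never runs past the image.
    let end = (nominal_start + tile_size + overlap).min(image_len);
    (start, end)
}

fn main() {
    // A 1000-pixel-wide image with --tile-size 256 --overlap 1.
    for index in 0..4 {
        let (start, end) = tile_span(index, 256, 1, 1000);
        println!("tile {index}: {start}..{end} ({} px)", end - start);
    }
}
```

Interior tiles come out 2 × overlap pixels larger than the nominal tile size, which is what a viewer relies on when blending edges.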

--layout <FORMAT>

Tile naming and directory convention. Different viewers expect different conventions — OpenSeadragon uses DeepZoom, Leaflet uses XYZ, Google Maps uses its own grid. Default: deep-zoom

Values: deep-zoom ({level}/{col}_{row}.ext + .dzi), xyz ({z}/{x}/{y}.ext), google ({z}/{y}/{x}.ext, power-of-2 grids).
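The three naming conventions differ only in how the level, column, and row are arranged in the path. A minimal sketch of the patterns listed above; the Layout enum and tile_path function here are illustrative names, not the libviprs API.

```rust
/// Tile layouts as named by the --layout flag (illustrative enum).
#[derive(Clone, Copy)]
enum Layout {
    DeepZoom,
    Xyz,
    Google,
}

/// Build the relative path of one tile from its level and grid position.
fn tile_path(layout: Layout, level: u32, col: u32, row: u32, ext: &str) -> String {
    match layout {
        // {level}/{col}_{row}.ext
        Layout::DeepZoom => format!("{}/{}_{}.{}", level, col, row, ext),
        // {z}/{x}/{y}.ext
        Layout::Xyz => format!("{}/{}/{}.{}", level, col, row, ext),
        // {z}/{y}/{x}.ext (row before column)
        Layout::Google => format!("{}/{}/{}.{}", level, row, col, ext),
    }
}

fn main() {
    for layout in [Layout::DeepZoom, Layout::Xyz, Layout::Google] {
        println!("{}", tile_path(layout, 3, 5, 2, "png"));
    }
}
```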

--format <TYPE>

Tile image encoding. PNG preserves quality for line drawings; JPEG saves disk for photographs; raw is useful when a downstream pipeline handles encoding. Default: png

Values: png (lossless), jpeg (lossy, smaller), raw (unencoded pixels, fastest).

--quality <1-100>

JPEG quality factor. Only used with --format jpeg. Default: 85

--centre

Centre the image within the tile grid, adding even background padding on all sides. Google Maps layout requires power-of-2 tile grids — without centring, the image sits at (0,0) and padding fills the right/bottom. With centring, padding is distributed evenly so the image appears centred in the viewer.
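The padding arithmetic can be worked through by hand. The sketch below assumes a square canvas whose side is the tile size times the smallest power of two that covers the longer image edge; that is one plausible reading of "power-of-2 tile grids", and the function names are hypothetical.

```rust
/// Side length in pixels of an assumed square power-of-2 canvas.
fn google_canvas(width: u32, height: u32, tile_size: u32) -> u32 {
    // Tiles needed along the longer edge, rounded up.
    let tiles_needed = (width.max(height) + tile_size - 1) / tile_size;
    tiles_needed.next_power_of_two() * tile_size
}

/// Top-left position of the image inside the canvas.
fn image_offset(width: u32, height: u32, canvas: u32, centre: bool) -> (u32, u32) {
    if centre {
        // Split the padding evenly between opposite sides.
        ((canvas - width) / 2, (canvas - height) / 2)
    } else {
        // Without --centre the image sits at (0,0); padding fills right/bottom.
        (0, 0)
    }
}

fn main() {
    let (w, h, tile) = (16820, 11888, 256);
    let canvas = google_canvas(w, h, tile);
    println!("canvas: {canvas}x{canvas}");
    println!("offset with --centre: {:?}", image_offset(w, h, canvas, true));
    println!("offset without:       {:?}", image_offset(w, h, canvas, false));
}
```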

--skip-blank

Replace blank (uniform-color) tiles with a 1-byte placeholder instead of encoding the full image. Scanned blueprints have large white margins that produce many identical tiles — placeholder files reduce output size dramatically. Consumers detect them by file size and render a solid color.

--blank-tolerance <N>

Treat a tile as blank when every channel in every pixel varies by at most N from the first pixel's value. Extends --skip-blank to near-uniform tiles; useful for slightly noisy scanner output where pure-white detection misses off-white margins. Requires --skip-blank or --dedupe-blanks. Default: 0 (exact match only)
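The tolerance rule described above amounts to a per-channel comparison against the first pixel. A minimal sketch, assuming interleaved 8-bit channel data; is_blank is an illustrative name rather than a libviprs function.

```rust
/// Returns true when every channel of every pixel is within `tolerance`
/// of the corresponding channel of the first pixel.
/// `pixels` is interleaved channel data (e.g. RGBRGB...).
fn is_blank(pixels: &[u8], channels: usize, tolerance: u8) -> bool {
    if channels == 0 || pixels.len() < channels {
        return true; // nothing to compare against
    }
    let (first, rest) = pixels.split_at(channels);
    rest.chunks_exact(channels).all(|px| {
        px.iter()
            .zip(first)
            .all(|(&value, &reference)| value.abs_diff(reference) <= tolerance)
    })
}

fn main() {
    // Two RGB pixels: pure white and slightly off-white scanner noise.
    let tile = [255, 255, 255, 254, 255, 253];
    println!("exact match only: {}", is_blank(&tile, 3, 0));
    println!("tolerance 2:      {}", is_blank(&tile, 3, 2));
}
```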

PDF options

--dpi <N>

Resolution for PDF rendering and page-size scaling. Higher DPI produces more pixels (and more tiles). 72 matches libvips' default; use 150–300 for print-quality blueprints. Default: 72
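PDF page sizes are expressed in points (1/72 inch), so the pixel size at a given DPI follows from a single multiplication. The sketch below assumes rounding to the nearest pixel, which may not match the rounding viprs applies; points_to_pixels is a hypothetical helper.

```rust
/// Convert a PDF dimension in points (1/72 inch) to pixels at `dpi`.
/// Assumption: round to nearest; the real tool may floor or ceil instead.
fn points_to_pixels(points: f64, dpi: f64) -> u32 {
    (points * dpi / 72.0).round() as u32
}

fn main() {
    // A4 landscape is 841.89 x 595.28 pt.
    for dpi in [72.0, 150.0, 300.0] {
        println!(
            "{dpi} DPI: {} x {} px",
            points_to_pixels(841.89, dpi),
            points_to_pixels(595.28, dpi),
        );
    }
}
```

At 300 DPI this gives 3508 x 2480 px, the familiar pixel size of an A4 scan.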

--page <N>

PDF page to extract (1-based). Default: 1

--render

Use PDFium for vector PDF rendering instead of extracting embedded raster images. Scanned blueprints embed a raster image inside the PDF — extraction is fast and lossless. AutoCAD exports and text-heavy PDFs contain vector paths that must be rendered to pixels. Use --render when the PDF is not a simple raster scan.

--match-page-size

After extracting a raster from a PDF, resize it to match the PDF page dimensions at the specified --dpi. Has no effect with --render. Embedded rasters may have a different resolution than the page's declared dimensions — this flag produces output consistent with libvips' default PDF handling.

Engine options

--concurrency <N>

Number of worker threads for tile extraction. Tile extraction is embarrassingly parallel — each tile reads from the source independently. Set to the number of CPU cores for maximum throughput. With --parallel, this controls per-strip tile workers. Default: 0 (single-threaded)

--buffer-size <N>

Maximum tiles buffered between producer threads and the sink. A large buffer smooths out slow disk writes at the cost of higher memory; a small buffer limits peak memory but may cause producers to block. Default: 64
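The trade-off behind --buffer-size is the ordinary bounded-queue pattern. The sketch below illustrates it with the standard library's sync_channel; it is not the viprs implementation, and run_pipeline is a hypothetical name.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Push `tile_count` dummy tiles through a bounded channel with
/// `buffer_size` slots and return how many the sink received.
fn run_pipeline(tile_count: usize, buffer_size: usize) -> usize {
    let (tx, rx) = sync_channel::<Vec<u8>>(buffer_size);

    let producer = thread::spawn(move || {
        for i in 0..tile_count {
            // send() blocks while the buffer is full, which caps the
            // number of tiles held in memory at roughly `buffer_size`.
            tx.send(vec![i as u8; 16]).expect("sink hung up");
        }
        // `tx` is dropped here, which ends the receiver's iteration.
    });

    // The sink drains tiles as they arrive.
    let received = rx.iter().count();
    producer.join().expect("producer panicked");
    received
}

fn main() {
    println!("sink received {} tiles", run_pipeline(1000, 64));
}
```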

--memory-limit <MB>

Hard memory limit in MB. If the estimated peak memory exceeds this, the command exits before rendering. Acts as a safety net for containerised deployments — prevents the engine from attempting a pyramid that would OOM-kill the process. Combine with --memory-budget to both limit and control memory. Default: 0 (disabled)

--memory-budget <MB>

Soft memory budget for strip-based processing. Switches from the monolithic engine to the streaming engine, which processes the image in horizontal bands. A 16820×11888 blueprint at 300 DPI needs ~5 GB as a monolithic canvas — the streaming engine reduces this to ~50 MB by processing strips. The strip height is maximised within the budget to balance memory and speed.

When set to 0: auto-selects a budget (1/4 of estimated monolithic peak). When omitted: uses the monolithic engine (original behavior).

--parallel

Use the parallel MapReduce engine instead of the sequential streaming engine. Requires --memory-budget. On multi-core systems, the MapReduce engine overlaps strip rendering for higher throughput — the sequential streaming engine processes one strip at a time. Both produce byte-identical output.

Renders multiple strips concurrently (bounded by the budget), then propagates downscale results sequentially. The --concurrency flag controls per-strip tile worker threads.
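How --memory-budget and --parallel together pick an engine can be summarised as a small decision function. This is a sketch of the rules as documented above, with hypothetical type and function names; it ignores any fallback to the monolithic engine for small images under an auto-selected budget.

```rust
#[derive(Debug, PartialEq)]
enum Engine {
    Monolithic,
    Streaming { budget_mb: u64 },
    MapReduce { budget_mb: u64 },
}

/// `memory_budget` is None when --memory-budget is omitted.
/// `monolithic_peak_mb` is the estimated peak of the monolithic engine.
fn select_engine(
    memory_budget: Option<u64>,
    parallel: bool,
    monolithic_peak_mb: u64,
) -> Result<Engine, String> {
    let budget_mb = match memory_budget {
        None if parallel => return Err("--parallel requires --memory-budget".into()),
        None => return Ok(Engine::Monolithic),
        // 0 means auto: a quarter of the estimated monolithic peak.
        Some(0) => (monolithic_peak_mb / 4).max(1),
        Some(mb) => mb,
    };
    Ok(if parallel {
        Engine::MapReduce { budget_mb }
    } else {
        Engine::Streaming { budget_mb }
    })
}

fn main() {
    println!("{:?}", select_engine(None, false, 2000));
    println!("{:?}", select_engine(Some(128), false, 2000));
    println!("{:?}", select_engine(Some(0), true, 2000));
    println!("{:?}", select_engine(None, true, 2000));
}
```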

Output sink options

--sink <URI>

Output sink URI. The default filesystem sink is used when this flag is omitted (the <OUTPUT> positional argument sets the directory). Two additional sink types are available with optional features:

s3://bucket/prefix — write tiles to an S3-compatible object store. Requires --features s3 at compile time and standard AWS credentials in the environment.

packfile://out.tar, packfile://out.tar.gz, packfile://out.zip — archive the pyramid into a single file. Requires --features packfile.

Resumable job options

--resume

Resume an interrupted run. Reads the checkpoint file (.libviprs-job.json) at the output root and skips tiles that were already written. The run completes from the last safe checkpoint. Cannot be combined with --overwrite or --verify.

--overwrite

Start fresh even if a previous run's output or checkpoint exists. All existing tiles and the checkpoint file are removed before the job begins. This is the default behaviour when none of --resume, --overwrite, or --verify is given.

--verify

Re-read every tile produced by a previous run and verify its checksum against the checkpoint. Exits with a non-zero status if any tile is missing or corrupt. Does not write new tiles. Requires checksums to have been emitted during the original run (--manifest-emit-checksums).

Retry and failure policy options

--retry-max <N>

Maximum number of retry attempts for a single tile write before the failure policy is applied. Default: 0 (no retry)

--retry-backoff <MS>

Initial back-off delay in milliseconds between retry attempts. Each subsequent attempt doubles the delay (exponential back-off). Default: 100

--failure-policy <POLICY>

What to do when a tile write exhausts its retries. Default: fail-fast

Values: fail-fast (abort immediately), retry-then-fail (retry up to --retry-max times then abort), retry-then-skip (retry then record the tile as missing in the manifest and continue).
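The interaction between --retry-max, --retry-backoff, and --failure-policy can be sketched as a retry loop. This is one reading of the flags (fail-fast never retries; the delay doubles after each failed attempt), not the viprs source; all names are hypothetical and the sleep is injected so the example runs instantly.

```rust
use std::time::Duration;

#[derive(Clone, Copy, Debug, PartialEq)]
enum FailurePolicy {
    FailFast,
    RetryThenFail,
    RetryThenSkip,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Written,
    Skipped,
    Aborted,
}

/// Delay before retry number `retry` (0-based): initial_ms * 2^retry.
fn backoff_delay(initial_ms: u64, retry: u32) -> Duration {
    Duration::from_millis(initial_ms.saturating_mul(1u64 << retry.min(32)))
}

fn write_with_retry<W, S>(
    mut write: W,
    mut sleep: S,
    policy: FailurePolicy,
    retry_max: u32,
    backoff_ms: u64,
) -> Outcome
where
    W: FnMut() -> Result<(), String>,
    S: FnMut(Duration),
{
    // Assumption: fail-fast aborts on the first error without retrying.
    let retries = if policy == FailurePolicy::FailFast { 0 } else { retry_max };
    for attempt in 0..=retries {
        if write().is_ok() {
            return Outcome::Written;
        }
        if attempt < retries {
            sleep(backoff_delay(backoff_ms, attempt));
        }
    }
    match policy {
        FailurePolicy::RetryThenSkip => Outcome::Skipped,
        FailurePolicy::FailFast | FailurePolicy::RetryThenFail => Outcome::Aborted,
    }
}

fn main() {
    // A write that always fails, with --retry-max 3 --retry-backoff 200.
    let outcome = write_with_retry(
        || Err("connection reset".to_string()),
        |delay| println!("waiting {delay:?} before retrying"),
        FailurePolicy::RetryThenSkip,
        3,
        200,
    );
    println!("{outcome:?}");
}
```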

Checksum options

--manifest-emit-checksums

Compute a digest for each tile and embed it in the manifest JSON. Does not verify existing tiles; use --verify for that. Requires --features checksum.

--checksum-algo <ALGO>

Hash algorithm for tile digests. Only used with --manifest-emit-checksums or --verify. Default: blake3

Values: blake3 (faster, 256-bit), sha256 (wider tooling support).

Deduplication options

--dedupe-blanks

Store only one copy of the blank placeholder tile and point all other blank positions at it via a symlink or hardlink. Significantly reduces inode count on images with large uniform margins. Requires --features dedupe.

--dedupe-all

Extend deduplication to every tile, not just blanks. The engine hashes each encoded tile; duplicate content is stored once and the rest are links or manifest references. Implies --manifest-emit-checksums. Requires --features dedupe. Slower than --dedupe-blanks due to per-tile hashing.
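Content-based deduplication boils down to remembering the first path seen for each digest. The sketch below uses the standard library's DefaultHasher purely as a stand-in for a real digest such as blake3 or sha256; a 64-bit non-cryptographic hash is not collision-safe enough for production use. The function names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in content key. A real implementation would use a cryptographic digest.
fn content_key(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// For each tile, return None if it should be stored, or Some(path) of the
/// earlier identical tile it should link to.
fn plan_links<'a>(tiles: &[(&'a str, Vec<u8>)]) -> Vec<(&'a str, Option<&'a str>)> {
    let mut first_seen: HashMap<u64, &'a str> = HashMap::new();
    tiles
        .iter()
        .map(|(path, bytes)| {
            let key = content_key(bytes);
            let existing = first_seen.get(&key).copied();
            if existing.is_none() {
                first_seen.insert(key, *path);
            }
            (*path, existing)
        })
        .collect()
}

fn main() {
    let tiles = vec![
        ("0/0_0.png", vec![1u8, 2, 3]),
        ("0/0_1.png", vec![9u8]),
        ("0/1_0.png", vec![1u8, 2, 3]), // same content as the first tile
    ];
    for (path, link) in plan_links(&tiles) {
        match link {
            Some(target) => println!("{path} -> link to {target}"),
            None => println!("{path} -> store"),
        }
    }
}
```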

Tracing options

--trace-level <LEVEL>

Emit structured tracing spans at the given verbosity level. Spans are named libviprs::pipeline, libviprs::level, and libviprs::tile. Attach any tracing-compatible subscriber (e.g. tracing-opentelemetry) to export spans to Jaeger, Tempo, or another collector. Requires --features tracing at compile time. Default: off

Values: error, warn, info, debug, trace.

Geo-reference options

--geo-origin <LON,LAT>

Geographic coordinate of the top-left pixel as "longitude,latitude".

--geo-scale <SX,SY>

Degrees per pixel as "scale_x,scale_y". Typically positive X (east) and negative Y (south). Together with --geo-origin, defines an affine transform that lets tile viewers overlay the pyramid on a world map.
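The affine transform defined by --geo-origin and --geo-scale is a multiply-and-add per axis. The sketch below also estimates the ground size of a pixel using a spherical approximation of about 111,320 m per degree; that constant and the function names are assumptions for illustration, not part of viprs.

```rust
/// Map a pixel position to (longitude, latitude) in degrees.
/// `origin` is the top-left pixel's (lon, lat); `scale` is degrees per pixel.
fn pixel_to_lon_lat(origin: (f64, f64), scale: (f64, f64), px: f64, py: f64) -> (f64, f64) {
    (origin.0 + px * scale.0, origin.1 + py * scale.1)
}

/// Approximate ground size of one pixel as (east-west, north-south) metres.
/// Assumption: spherical Earth, ~111,320 m per degree of latitude.
fn metres_per_pixel(scale: (f64, f64), latitude_deg: f64) -> (f64, f64) {
    const METRES_PER_DEGREE: f64 = 111_320.0;
    let north_south = scale.1.abs() * METRES_PER_DEGREE;
    // A degree of longitude shrinks with the cosine of the latitude.
    let east_west = scale.0.abs() * METRES_PER_DEGREE * latitude_deg.to_radians().cos();
    (east_west, north_south)
}

fn main() {
    let origin = (-73.9857, 40.7484);
    let scale = (0.00001, -0.00001);
    let (lon, lat) = pixel_to_lon_lat(origin, scale, 1000.0, 1000.0);
    println!("pixel (1000, 1000) -> lon {lon:.4}, lat {lat:.4}");
    let (ew, ns) = metres_per_pixel(scale, origin.1);
    println!("~{ew:.2} m east-west, ~{ns:.2} m north-south per pixel");
}
```

The negative Y scale makes latitude decrease as the row index grows, matching image coordinates where Y points down.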

info

Show metadata about a PDF or image file without generating tiles. For PDFs, displays page count, dimensions, and embedded image details. For images, displays pixel dimensions and format.

use libviprs::{pdf_info, decode_file};
use std::path::Path;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let path = Path::new("/path/to/your/input.pdf");

    if path.extension().and_then(|e| e.to_str()) == Some("pdf") {
        let info = pdf_info(path)?;
        println!("PDF: {} pages", info.page_count);
        for page in &info.pages {
            println!(
                "  Page {}: {:.1} x {:.1} pts",
                page.page_number, page.width_pts, page.height_pts,
            );
        }
    } else {
        let raster = decode_file(path)?;
        println!(
            "{}x{} {:?}",
            raster.width(), raster.height(), raster.format(),
        );
    }
    Ok(())
}

Reference tests: pdf_ops.rs · pdf_info_reads_page_count · libviprs/src/source.rs · generate_test_raster_dimensions (decode flow).

<INPUT>

Path to the PDF or image file to inspect. Useful before running pyramid to understand source dimensions, plan memory usage, and decide on DPI settings.

plan

Show the tile pyramid plan without generating any tiles. Displays the number of levels, total tiles, and per-level grid dimensions. Useful for capacity planning and understanding how tile parameters affect output.

use libviprs::{Layout, PyramidPlanner};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Plan from explicit dimensions — no source file needed.
    let planner = PyramidPlanner::new(
        16820, 11888,        // width, height
        256,                 // --tile-size
        0,                   // --overlap
        Layout::Google,      // --layout
    )?
    .with_centre(true);      // --centre

    let plan = planner.plan();
    let (canvas_w, canvas_h) = planner.canvas_dimensions();

    println!("Canvas: {}x{}", canvas_w, canvas_h);
    println!("Levels: {}, total tiles: {}",
        plan.level_count(),
        plan.total_tile_count(),
    );
    println!("Estimated peak memory: {:.1} MB",
        planner.estimate_peak_memory() as f64 / (1024.0 * 1024.0));
    Ok(())
}

Reference tests: libviprs/src/planner.rs · total_tile_count_sums_all_levels · streaming_engine.rs · estimate_streaming_memory_reasonable (peak-memory estimation).
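For a rough idea of where the level and tile counts come from, the deep-zoom case can be worked out by hand. The sketch below assumes the standard Deep Zoom convention (halve the image, rounding up, until it is 1×1, with no overlap); it does not call libviprs, and the planner's figures for other layouts or settings may differ.

```rust
/// Ceiling division for positive integers.
fn div_ceil(a: u32, b: u32) -> u32 {
    (a + b - 1) / b
}

/// Tile grid (cols, rows) for each level, from full resolution down to 1x1.
/// Assumption: standard Deep Zoom halving, not the libviprs planner itself.
fn deep_zoom_grids(width: u32, height: u32, tile_size: u32) -> Vec<(u32, u32)> {
    let (mut w, mut h) = (width.max(1), height.max(1));
    let mut grids = Vec::new();
    loop {
        grids.push((div_ceil(w, tile_size), div_ceil(h, tile_size)));
        if w == 1 && h == 1 {
            break;
        }
        // Each coarser level is half the size, rounded up.
        w = div_ceil(w, 2);
        h = div_ceil(h, 2);
    }
    grids
}

/// Sum of cols * rows over all levels.
fn total_tiles(grids: &[(u32, u32)]) -> u64 {
    grids.iter().map(|&(c, r)| u64::from(c) * u64::from(r)).sum()
}

fn main() {
    let grids = deep_zoom_grids(16820, 11888, 256);
    println!("levels: {}, total tiles: {}", grids.len(), total_tiles(&grids));
    for (i, (cols, rows)) in grids.iter().enumerate().take(3) {
        println!("  {} below full resolution: {cols} x {rows} tiles", i);
    }
}
```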

<WIDTH_OR_INPUT>

Either a pixel width (when paired with --height) or a path to an image/PDF file to read dimensions from.

--height <PIXELS>

Image height in pixels. Required when the first argument is a number.

--tile-size, --overlap, --layout, --centre

Same as in pyramid. Control how the plan is computed.

--dpi, --page

Used when the input is a PDF to resolve page dimensions to pixels.

test-image

Generate a synthetic RGB8 gradient test image. Useful for benchmarking, testing, and verifying the tile pipeline without needing real image data.

use libviprs::{generate_test_raster, sink::encode_png};
use std::fs;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let raster = generate_test_raster(1024, 1024)?;
    let bytes  = encode_png(&raster)?;
    fs::write("/path/to/output.png", bytes)?;
    Ok(())
}

Reference tests: libviprs/src/source.rs · generate_test_raster_dimensions · libviprs/src/sink.rs · encode_png_gray8.

<OUTPUT>

Output file path (PNG).

--width <PIXELS>

Image width. Default: 1024

--height <PIXELS>

Image height. Default: 1024

Examples

Basic pyramid generation

DeepZoom from an image (default settings)

$ viprs pyramid photo.jpg tiles/

Produces a DeepZoom pyramid at tiles/ with a companion tiles.dzi manifest. Tiles are 256×256 PNG with no overlap. The monolithic engine holds the entire image in memory.

XYZ layout with JPEG tiles

$ viprs pyramid aerial.tiff map_tiles/ \
    --layout xyz --format jpeg --quality 90

Outputs {z}/{x}/{y}.jpeg tiles for Leaflet or Mapbox. JPEG reduces tile size ~5× vs PNG for photographic content.

Google Maps layout with centring

$ viprs pyramid blueprint.png google_tiles/ \
    --layout google --centre

Google layout uses power-of-2 tile grids. --centre places the image in the middle of the canvas with even padding.

DeepZoom with overlap for OpenSeadragon

$ viprs pyramid scan.png viewer_tiles/ \
    --tile-size 512 --overlap 1

512px tiles with 1px overlap. OpenSeadragon uses the overlap to blend adjacent tiles for seamless rendering at high zoom.

PDF inputs

Scanned blueprint (raster extraction)

$ viprs pyramid blueprint.pdf tiles/ --page 2

Extracts the embedded raster image from page 2 of the PDF. Fast and lossless — no rendering needed for scanned documents.

Vector PDF rendering with PDFium

$ viprs pyramid floorplan.pdf tiles/ \
    --render --dpi 300

Renders the vector PDF at 300 DPI using PDFium. Required for AutoCAD exports, text-heavy documents, and PDFs with paths/shapes instead of embedded images.

Match page dimensions after extraction

$ viprs pyramid scan.pdf tiles/ \
    --match-page-size --dpi 150

Extracts the embedded raster, then resizes it to match the PDF page dimensions at 150 DPI. Produces output consistent with libvips' default PDF handling.

Blank tile optimization

Skip blank tiles on sparse images

$ viprs pyramid whiteboard.png tiles/ --skip-blank

Done: 341 tiles, 11 levels, peak memory 12.5 MB, 2.31s (287 blank tiles skipped)

Replaces 287 uniform-color tiles with 1-byte placeholders. Total output drops from ~45 MB to ~8 MB on a typical whiteboard scan.

Multi-threaded execution

Parallel tile extraction (monolithic engine)

$ viprs pyramid large.png tiles/ \
    --concurrency 8 --buffer-size 128

8 worker threads extract tiles in parallel from the full-resolution canvas. Buffer of 128 tiles smooths out disk I/O variance. Best when the image fits comfortably in RAM.

Memory-constrained processing

Streaming engine with explicit budget

$ viprs pyramid huge_scan.png tiles/ \
    --memory-budget 128

Streaming: budget 128.0 MB, strip_height=2048, estimated peak 98.4 MB
Done: 4521 tiles, 15 levels, peak memory 97.2 MB, 14.8s

Processes the image in 2048px-tall horizontal strips, keeping peak memory under 128 MB. The monolithic engine would need ~2 GB for this image.

Auto-select streaming budget

$ viprs pyramid huge_scan.png tiles/ --memory-budget 0

Budget is set to 1/4 of the estimated monolithic peak. The engine auto-selects: monolithic if the image is small, streaming otherwise. Best default for unknown image sizes.

Hard memory limit (safety net)

$ viprs pyramid huge.pdf tiles/ \
    --memory-limit 512 --memory-budget 128 --render --dpi 300

Exits with an error if the estimated peak memory exceeds 512 MB; otherwise runs the streaming engine with a 128 MB budget. Double protection for containers with hard memory limits.

Parallel MapReduce engine

MapReduce with concurrent strip rendering

$ viprs pyramid huge_scan.png tiles/ \
    --memory-budget 256 --parallel

MapReduce: budget 256.0 MB, strip_height=2048, 4 in-flight strips, estimated peak 198.7 MB
Done: 4521 tiles, 15 levels, peak memory 195.3 MB, 9.2s

Renders 4 strips concurrently within the 256 MB budget. ~35% faster than sequential streaming on a 4-core machine. Output is byte-identical.

MapReduce with per-strip tile workers

$ viprs pyramid huge_scan.png tiles/ \
    --memory-budget 256 --parallel --concurrency 4

Each strip also uses 4 tile-extraction workers for an additional level of parallelism. Best on systems with 8+ cores and fast storage.

MapReduce with auto budget and blank skipping

$ viprs pyramid blueprint.pdf tiles/ \
    --memory-budget 0 --parallel --skip-blank \
    --layout google --centre

Full-featured pipeline: auto budget, parallel processing, blank tile optimization, Google layout with centring. The engine chooses the optimal strip height and in-flight count.

Geo-referenced tiles

Geo-referenced DeepZoom pyramid

$ viprs pyramid site_plan.png tiles/ \
    --geo-origin "-73.9857,40.7484" --geo-scale "0.00001,-0.00001"

Maps the top-left pixel to (73.9857°W, 40.7484°N). At this latitude, 0.00001° per pixel is roughly 1.1 m north-south and about 0.84 m east-west. Enables map viewers to overlay the pyramid on a world map.

Inspection and planning

Inspect a PDF file

$ viprs info blueprint.pdf

PDF: blueprint.pdf
Pages: 3
  Page 1: 841.89 x 595.28 pt (A4 landscape)
    Image: 3508 x 2480 px (JPEG, 1234567 bytes)
  Page 2: 841.89 x 595.28 pt
    Image: 3508 x 2480 px (JPEG, 987654 bytes)
  Page 3: 841.89 x 595.28 pt
    (no embedded raster — use --render)

Shows page dimensions, embedded raster details, and whether PDFium rendering is needed.

Preview the pyramid plan

$ viprs plan blueprint.pdf --tile-size 256 --layout google --centre

Shows levels, tile counts, and canvas dimensions without generating any tiles. Use this to estimate output size and pick the right tile parameters.

Plan from explicit dimensions

$ viprs plan 16820 --height 11888 --layout google --centre

Plan using raw pixel dimensions instead of reading from a file.

Test images

Generate a test image for benchmarking

$ viprs test-image gradient.png --width 4096 --height 4096

Creates a 4096×4096 RGB8 gradient image. Useful for benchmarking the tile engine without needing real production images.

Full pipeline test (generate + tile)

$ viprs test-image /tmp/test.png --width 2048 --height 2048
$ viprs pyramid /tmp/test.png /tmp/tiles/ \
    --memory-budget 8 --parallel --concurrency 4 --skip-blank

End-to-end smoke test: create a synthetic image, then tile it with every engine feature enabled.

Stdin input

Pipe an image from another command

$ curl -s https://example.com/scan.png | viprs pyramid - tiles/

Use - as the input to read from stdin. Works with any command that outputs image data.

Object-storage sink

Write pyramid directly to S3

$ viprs pyramid blueprint.pdf --sink s3://my-bucket/blueprints/site-a/ \
    --format png --manifest-emit-checksums

Tiles are streamed to S3 without a local temp directory. Credentials are read from the standard AWS credential chain (AWS_ACCESS_KEY_ID, instance role, etc.). Requires compile-time --features s3.

Packfile archive

Archive a pyramid into a single .tar.gz

$ viprs pyramid scan.tiff --sink packfile://scan_tiles.tar.gz \
    --layout deep-zoom --format png

Outputs a single scan_tiles.tar.gz containing the full tile tree and the .dzi manifest. Requires compile-time --features packfile.

Resumable job

Resume an interrupted run

# First run (interrupted)
$ viprs pyramid huge.pdf tiles/ --manifest-emit-checksums
^C  # killed after 3200/4521 tiles

# Resume from checkpoint
$ viprs pyramid huge.pdf tiles/ --resume --manifest-emit-checksums
Resuming: 3200 tiles already written, 1321 remaining
Done: 4521 tiles, 15 levels, 1321 new tiles written, 6.3s

The checkpoint file tiles/.libviprs-job.json records each completed tile. On resume, only missing tiles are re-rendered.

Verify a completed run

$ viprs pyramid huge.pdf tiles/ --verify
Verified 4521 tiles. All checksums match.

Reads every tile and confirms it matches the stored digest. Non-zero exit if any tile is missing or corrupt.

Retry and failure policy

Retry transient errors, skip permanently failing tiles

$ viprs pyramid blueprint.pdf --sink s3://my-bucket/out/ \
    --retry-max 5 --retry-backoff 500 \
    --failure-policy retry-then-skip

Each failing tile is retried up to 5 times with 500 ms initial back-off. If still failing, the tile is recorded as missing in the manifest and the job continues. Useful for long-running S3 uploads over unreliable links.

Blank-tile tolerance

Collapse near-white tiles on a noisy scan

$ viprs pyramid noisy_scan.pdf tiles/ --skip-blank --blank-tolerance 2

Done: 341 tiles, 11 levels, peak memory 12.1 MB, 2.18s (312 blank tiles skipped)

Tiles where every pixel channel is within ±2 of the background value are treated as blank. Without --blank-tolerance, scanner noise would prevent most margin tiles from matching the exact background colour.

Checksums and deduplication

Emit Blake3 checksums and deduplicate blank tiles

$ viprs pyramid blueprint.pdf tiles/ \
    --manifest-emit-checksums --checksum-algo blake3 \
    --dedupe-blanks

The manifest records a Blake3 digest for each tile. A single copy of the blank placeholder is stored; all other blank tile paths are symlinks to it.

Full deduplication with content hashing

$ viprs pyramid blueprint.pdf tiles/ --dedupe-all

Every tile is hashed after encoding. Duplicate content (common in tiled vector drawings) is stored once. Inode count can drop by 40–60% on typical blueprint pyramids.

Tracing

Emit info-level spans

$ viprs pyramid blueprint.pdf tiles/ --trace-level info

Emits libviprs::pipeline, libviprs::level, and libviprs::tile spans. Wire up a tracing-opentelemetry subscriber before calling viprs from a Rust driver, or export via the OTEL_EXPORTER_OTLP_ENDPOINT environment variable. Requires compile-time --features tracing.

Engine Comparison

Scenario                     Flags                                           Engine                       Best for
Small image, fast disk       (none)                                          Monolithic                   Images that fit in RAM
Small image, multi-core      --concurrency 8                                 Monolithic (parallel tiles)  Throughput on multi-core
Large image, limited RAM     --memory-budget 128                             Streaming                    Memory-constrained containers
Large image, multi-core      --memory-budget 256 --parallel                  MapReduce                    Fast + memory-bounded
Large image, max throughput  --memory-budget 256 --parallel --concurrency 8  MapReduce (full parallel)    Dedicated build servers

All engines produce byte-identical output. Choose based on available memory and CPU cores.

Cargo Features

Optional capabilities are gated behind Cargo features. Pass them at build time:

$ cargo install --path . --features s3,tracing,checksum,dedupe

Feature        Default  Enables
pdfium         no       --render flag (dynamic libpdfium)
pdfium-static  no       --render flag (static libpdfium, larger binary)
s3             no       --sink s3://…
tracing        no       --trace-level
packfile       no       --sink packfile://…
checksum       no       --manifest-emit-checksums, --checksum-algo, --verify
dedupe         no       --dedupe-blanks, --dedupe-all

Exit Status

Code  Meaning
0     Success
1     Error (invalid args, file not found, memory limit exceeded, engine failure)

See Also

libviprs home
API documentation
GitHub repository
libvips (the project that inspired libviprs)