v0.3
libviprs

/lɪb-ˈvaɪ·pərz/

A pure-Rust, thread-safe image pyramiding engine for the AEC/construction domain.

Rust 1.85+ · MIT License

Takes blueprint PDFs and images, extracts raster data, optionally geo-references it, and generates tile pyramids (DeepZoom, XYZ, Google Maps) suitable for web-based viewers. Inspired by libvips, built from scratch.

Benchmark

A pure-Rust image pyramiding library is outperforming the C heavyweight it was designed to replace. In head-to-head benchmarks on a 47-megapixel raster — the kind of image AEC firms tile every day from scanned blueprint PDFs — libviprs generates tiles 1.8× to 2.4× faster than libvips while using up to 10.7× less memory. Those aren't typos. And this isn't a rigged comparison: both libraries receive the same raw pixel buffer in process memory, both produce DeepZoom tile pyramids with identical parameters, and both are measured with the same clock. The difference is architectural.

Read the full benchmark →

Features

Quick Start

use libviprs::{
    extract_page_image,
    BlankTileStrategy,
    EngineBuilder,
    EngineKind,
    FsSink,
    Layout,
    PyramidPlanner,
    TileFormat,
};
use std::path::Path;

// ──────────────────────────────────────────────────────────────────────
// 1. Decode the source.
//    extract_page_image pulls the embedded raster out of a scanned PDF;
//    use libviprs::decode_file for plain image inputs (PNG/JPEG/TIFF).
// ──────────────────────────────────────────────────────────────────────
let raster = extract_page_image(
    Path::new("blueprint.pdf"),  // input PDF path (plain images go through decode_file instead)
    1,                           // 1-based PDF page number
).unwrap();

// ──────────────────────────────────────────────────────────────────────
// 2. Plan the pyramid.
//    PyramidPlanner computes per-level dimensions, tile counts, and
//    canvas size — no pixels are touched yet.
// ──────────────────────────────────────────────────────────────────────
let planner = PyramidPlanner::new(
    raster.width(),    // source width in pixels
    raster.height(),   // source height in pixels
    256,               // tile size (DeepZoom default; 512 for HiDPI)
    0,                 // pixel overlap between adjacent tiles
    Layout::DeepZoom,  // tile naming convention (also: Xyz, Google)
).unwrap();

let plan = planner.plan();

// ──────────────────────────────────────────────────────────────────────
// 3. Configure where the tiles get written.
//    FsSink writes to a local directory; libviprs also ships
//    ObjectStoreSink (S3) and PackfileSink (.tar/.tar.gz/.zip).
// ──────────────────────────────────────────────────────────────────────
let sink = FsSink::new(
    "output_tiles",  // output directory (created if missing)
    plan.clone(),    // pyramid plan tile paths are derived from
)
.with_format(TileFormat::Png);  // also: TileFormat::Jpeg { quality: u8 } | Raw

// ──────────────────────────────────────────────────────────────────────
// 4. Run the engine.
//    EngineKind::Auto picks monolithic / streaming / mapreduce based on
//    the source kind and any with_memory_budget value supplied.
// ──────────────────────────────────────────────────────────────────────
let result = EngineBuilder::new(
    &raster,  // source raster from step 1
    plan,     // pyramid plan from step 2
    sink,     // tile sink from step 3
)
.with_engine(EngineKind::Auto)                        // auto-select engine
.with_concurrency(4)                                   // worker threads for tile extraction
.with_blank_strategy(BlankTileStrategy::Placeholder)   // collapse uniform tiles into 1-byte placeholders
.run()
.unwrap();

println!(
    "{} tiles across {} levels ({} blank tiles skipped)",
    result.tiles_produced,    // total tile files written
    result.levels_processed,  // number of pyramid levels
    result.tiles_skipped,     // tiles emitted as blank placeholders
);
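
To make step 2 more concrete, here is a minimal standalone sketch of the kind of level math a DeepZoom planner performs. It does not call libviprs: `LevelInfo` and `deepzoom_levels` are hypothetical names, tile overlap is ignored, and it assumes the usual DeepZoom convention in which level 0 is 1×1 and each level doubles the resolution until the full image is reached. PyramidPlanner's actual internals may differ.

```rust
/// Dimensions and tile grid of one pyramid level (hypothetical type, not the libviprs API).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct LevelInfo {
    level: u32,
    width: u32,
    height: u32,
    cols: u32,
    rows: u32,
}

/// ceil(value / 2^shift), computed in u64 so the rounding term cannot overflow.
fn ceil_shr(value: u32, shift: u32) -> u32 {
    (((value as u64) + (1u64 << shift) - 1) >> shift) as u32
}

/// ceil(a / b) for a >= 1 and b >= 1.
fn ceil_div(a: u32, b: u32) -> u32 {
    (a - 1) / b + 1
}

/// Lists every level from 1x1 (level 0) up to full resolution.
fn deepzoom_levels(width: u32, height: u32, tile_size: u32) -> Vec<LevelInfo> {
    assert!(width > 0 && height > 0 && tile_size > 0);
    let max_dim = width.max(height);
    // ceil(log2(max_dim)): how many halvings it takes to reach a single pixel.
    let max_level = 32 - (max_dim - 1).leading_zeros();
    (0..=max_level)
        .map(|level| {
            let shift = max_level - level;
            let w = ceil_shr(width, shift);
            let h = ceil_shr(height, shift);
            LevelInfo {
                level,
                width: w,
                height: h,
                cols: ceil_div(w, tile_size),
                rows: ceil_div(h, tile_size),
            }
        })
        .collect()
}
```

Under these assumptions, a 16820×11888 source with 256-pixel tiles gives 16 levels, and the full-resolution level is a 66×47 tile grid.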

Modules

| Module | Description |
|---|---|
| source | Image decoding (JPEG, PNG, TIFF) into canonical Raster |
| pdf | PDF parsing (lopdf) and optional rendering (PDFium) |
| raster | Pixel buffer, region views, format normalization |
| pixel | Pixel format definitions (Gray8, RGB8, RGBA8, 16-bit) |
| planner | Tile math, level computation, layout generation |
| resize | Downscaling for pyramid levels |
| engine | Multi-threaded tile extraction with backpressure |
| streaming | Strip-based streaming engine for memory-bounded pyramid generation |
| streaming_mapreduce | Parallel MapReduce engine: concurrent strip rendering within a memory budget |
| sink | Tile output (filesystem, memory, slow sink for testing) |
| geo | Affine geo-transform, GCP solving, bounding box |
| observe | Progress events, memory tracking |

Streaming Engine

Large raster images (scanned blueprints at 300 DPI, aerial surveys, architectural sheets) can easily exceed available RAM when the monolithic engine materialises the full canvas. The streaming engine solves this by processing the pyramid in horizontal strips, which cuts peak memory by roughly two orders of magnitude in the example below (about 5.1 GB down to about 50 MB) while producing pixel-exact output.

Unlike libvips, which implements a fully lazy demand-driven pipeline where each pixel is computed on demand through a complex DAG of operations, libviprs takes a pragmatic middle ground: strips are processed eagerly within each band, but the full canvas is never materialised. This keeps the pipeline architecture simple and auditable while delivering the memory savings that matter for AEC-scale images.

| | libvips | libviprs (monolithic) | libviprs (streaming) |
|---|---|---|---|
| Peak memory | O(tile_size²) | O(canvas²) | O(canvas_w × strip_h) |
| 16820×11888 Google+centre | ~1 MB | ~5.1 GB | ~50 MB |
| Evaluation model | Fully lazy (per-region) | Fully eager | Semi-lazy (per-strip) |
| Downscale | On-demand per-region | Full-level passes | Strip passes (same box filter) |
| Trade-off | Complex pipeline plumbing | Simple but memory-hungry | Middle ground |

The caller sets a memory budget; the engine maximises strip height within that budget to balance memory and throughput. When the budget is large enough for the full canvas, the monolithic engine is used automatically — no code changes needed.

For vector PDFs, PdfiumStripSource renders one strip at a time directly from the PDF using PDFium's clipped matrix path (with full /Rotate support), so the full page bitmap is never materialised. BudgetPolicy::Error fails loudly if the chosen budget cannot fit the smallest workable strip; BudgetPolicy::AutoAdjust raises the budget and continues. The engine also pre-flights the budget against canvas geometry before allocating, surfacing a typed BudgetExceeded error rather than running until OOM.
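
The budget pre-flight and the two policies described above can be sketched as follows. This is a standalone illustration rather than the library's implementation. It assumes that the strip buffer dominates memory use, that the smallest workable strip is one tile row across the full canvas width, and that strip heights are whole multiples of the tile size. `Policy`, `BudgetTooSmall`, and `plan_strip` are hypothetical stand-ins for BudgetPolicy, the BudgetExceeded error, and the engine's internal check.

```rust
/// Stand-in for BudgetPolicy (hypothetical; mirrors the two variants named above).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Policy {
    Error,
    AutoAdjust,
}

/// Stand-in for the typed BudgetExceeded error (field names are assumptions).
#[derive(Debug, PartialEq, Eq)]
struct BudgetTooSmall {
    required_bytes: u64,
    budget_bytes: u64,
}

/// Returns (strip_height_px, effective_budget_bytes) without allocating anything.
fn plan_strip(
    canvas_width: u64,
    bytes_per_pixel: u64,
    tile_size: u64,
    budget_bytes: u64,
    policy: Policy,
) -> Result<(u64, u64), BudgetTooSmall> {
    assert!(canvas_width > 0 && bytes_per_pixel > 0 && tile_size > 0);
    // Assumed minimum: one tile row spanning the full canvas width.
    let min_strip_bytes = canvas_width * bytes_per_pixel * tile_size;
    if budget_bytes < min_strip_bytes {
        return match policy {
            // Fail before any allocation instead of running until OOM.
            Policy::Error => Err(BudgetTooSmall {
                required_bytes: min_strip_bytes,
                budget_bytes,
            }),
            // Raise the budget to the minimum workable strip and continue.
            Policy::AutoAdjust => Ok((tile_size, min_strip_bytes)),
        };
    }
    // Use as many whole tile rows as fit in the budget.
    let tile_rows = budget_bytes / min_strip_bytes;
    Ok((tile_rows * tile_size, budget_bytes))
}
```

With these assumptions, a 16820-pixel-wide RGBA canvas and a 50 MB budget fit two tile rows, so the strip is 512 pixels tall. A 1 MB budget either fails with the required size (17,223,680 bytes) or is raised to it, depending on the policy.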

Phase 3 Hardening

Phase 3 adds production-hardening features: durable storage sinks, resumable jobs, retry policies, checksums, deduplication, structured tracing, and versioned manifests. All features are opt-in via Cargo feature flags.

Object-storage sinks (--features s3)

Write tiles directly to an S3-compatible bucket via ObjectStoreSink. The sink runs against any caller-provided ObjectStore backend.

use libviprs::{EngineBuilder, ObjectStoreSink, ObjectStoreConfig, TileFormat};

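// Assumes `store` is a caller-provided ObjectStore backend, and that
// `raster` and `plan` were built as in the Quick Start.
let sink = ObjectStoreSink::new(store, ObjectStoreConfig::default(), plan.clone())
    .with_format(TileFormat::Png);
EngineBuilder::new(&raster, plan, sink).run().unwrap();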

Manifest v1 + versioning

ManifestBuilder produces a ManifestV1 JSON sidecar next to the pyramid root. The manifest carries a schema_version field; readers ignore unknown keys, so older consumers continue working as new fields are added. Attach the builder to an FsSink via with_manifest.

use libviprs::{FsSink, ManifestBuilder, ChecksumAlgo, TileFormat};

let manifest = ManifestBuilder::new(&plan)
    .checksum_algo(ChecksumAlgo::Blake3);
let sink = FsSink::new("output_tiles", plan.clone())
    .with_format(TileFormat::Png)
    .with_manifest(manifest);
// serialises to: {"schema_version":1, "tiles": [...], ...}

Blank-tile tolerance

Use BlankTileStrategy::PlaceholderWithTolerance to treat tiles whose every channel varies by at most max_channel_delta as blank. Useful for slightly noisy scans where pure-uniform detection misses near-white regions.

use libviprs::{BlankTileStrategy, EngineBuilder};

let result = EngineBuilder::new(&raster, plan, sink)
    .with_blank_strategy(BlankTileStrategy::PlaceholderWithTolerance {
        max_channel_delta: 2,
    })
    .run()
    .unwrap();

CLI equivalent: --blank-tolerance 2
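
As a rough illustration of the tolerance rule, the sketch below treats a tile as blank when, for every channel, the spread between the smallest and largest sample is at most `max_channel_delta`. It works on an interleaved 8-bit buffer and is independent of libviprs. `is_blank_with_tolerance` is a hypothetical name, and the library's actual detection may differ, for example by comparing against a reference colour.

```rust
/// Returns true when every channel of an interleaved 8-bit tile varies by at
/// most `max_channel_delta` across the whole tile.
fn is_blank_with_tolerance(pixels: &[u8], channels: usize, max_channel_delta: u8) -> bool {
    assert!(channels > 0 && pixels.len() % channels == 0);
    let mut lo = vec![u8::MAX; channels];
    let mut hi = vec![u8::MIN; channels];
    for px in pixels.chunks_exact(channels) {
        for (c, &sample) in px.iter().enumerate() {
            lo[c] = lo[c].min(sample);
            hi[c] = hi[c].max(sample);
        }
    }
    // saturating_sub keeps the empty-tile case (hi < lo) at a spread of zero.
    lo.iter()
        .zip(&hi)
        .all(|(&l, &h)| h.saturating_sub(l) <= max_channel_delta)
}
```

For two RGB pixels (250, 250, 250) and (251, 252, 250), the per-channel spreads are 1, 2, and 0, so the tile counts as blank at a tolerance of 2 but not at 1.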

Tracing spans (--features tracing)

When the tracing feature is enabled, the engine emits structured spans compatible with any tracing-subscriber, including OpenTelemetry exporters. Span names: libviprs::pipeline, libviprs::level, libviprs::tile.

# Cargo.toml
libviprs = { version = "0.3", features = ["tracing"] }

CLI: --trace-level info (values: error, warn, info, debug, trace).

Resumable jobs

EngineBuilder::with_resume(ResumePolicy) writes a checkpoint file (.libviprs-job.json) at the output root after each tile. A re-run with ResumePolicy::resume() skips already-written tiles; ResumePolicy::verify() re-reads them and asserts their checksums.

use libviprs::{EngineBuilder, ResumePolicy};

EngineBuilder::new(&raster, plan, sink)
    .with_resume(ResumePolicy::resume())
    .run()
    .unwrap();

CLI: --resume / --overwrite / --verify
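
The skip-on-resume behaviour boils down to a set difference between the planned tiles and the tiles the checkpoint already lists. The sketch below shows only that idea. The on-disk format of .libviprs-job.json is not described here, so the checkpoint is modelled as a plain set of tile paths, and `pending_tiles` is a hypothetical helper.

```rust
use std::collections::HashSet;

/// Returns the planned tile paths that are not yet recorded as written,
/// preserving plan order.
fn pending_tiles<'a>(planned: &[&'a str], completed: &HashSet<String>) -> Vec<&'a str> {
    planned
        .iter()
        .copied()
        .filter(|path| !completed.contains(*path))
        .collect()
}
```

A verify-style pass would instead re-read each completed tile and compare its digest before deciding whether to skip it.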

Retry + failure policies

Tile writes that fail transiently (network blip, lock contention) can be retried automatically. With FailurePolicy::RetryThenSkip, a tile that still fails after the retries is recorded as missing in the manifest instead of aborting the job. The example below uses FailurePolicy::RetryThenFail.

use libviprs::{EngineBuilder, FailurePolicy, RetryPolicy};

let result = EngineBuilder::new(&raster, plan, sink)
    .with_failure_policy(FailurePolicy::RetryThenFail)
    .with_retry(RetryPolicy { max_attempts: 3, backoff_ms: 200 })
    .run()
    .unwrap();

CLI: --retry-max 3 --retry-backoff 200 --failure-policy retry-then-fail
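
For readers unfamiliar with the pattern, a retry loop with a bounded attempt count can be sketched in a few lines of standard-library Rust. This is a generic illustration, not libviprs code. `retry_fixed` is a hypothetical helper, and it assumes a fixed delay of `backoff_ms` between attempts; whether the library uses a fixed or growing delay is not stated here.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Calls `op` up to `max_attempts` times, sleeping `backoff_ms` between
/// failed attempts. `op` receives the 1-based attempt number.
fn retry_fixed<T, E>(
    max_attempts: u32,
    backoff_ms: u64,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    assert!(max_attempts > 0, "at least one attempt is required");
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                sleep(Duration::from_millis(backoff_ms));
                attempt += 1;
            }
        }
    }
}
```

A retry-then-fail policy propagates the final `Err`, while a retry-then-skip policy would catch it, record the tile as missing, and move on.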

Checksums

The engine can compute a digest for every tile and either embed it in the manifest (ChecksumMode::EmitOnly) or verify existing digests on re-run (ChecksumMode::Verify).

use libviprs::{ChecksumAlgo, ChecksumMode, FsSink, TileFormat};

let sink = FsSink::new("output_tiles", plan.clone())
    .with_format(TileFormat::Png)
    .with_checksums(ChecksumAlgo::Blake3)
    .with_checksum_mode(ChecksumMode::EmitOnly);

CLI: --manifest-emit-checksums --checksum-algo blake3

Deduplication

Identical tile content is stored once. The engine tries symlink first, then hardlink, then falls back to a manifest-only reference (for sinks that do not support links, such as S3).

use libviprs::{ChecksumAlgo, DedupeStrategy, EngineBuilder};

// Dedupe only blank placeholder tiles (fast, low overhead)
EngineBuilder::new(&raster, plan.clone(), sink)
    .with_dedupe(DedupeStrategy::Blanks)
    .run()
    .unwrap();

// Alternatively, dedupe every tile using Blake3 content hashes
// (use a fresh sink here; the builder above took ownership of `sink`)
EngineBuilder::new(&raster, plan, sink)
    .with_dedupe(DedupeStrategy::All { algo: ChecksumAlgo::Blake3 })
    .run()
    .unwrap();

CLI: --dedupe-blanks / --dedupe-all
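
The core of content deduplication is remembering the first path seen for each distinct tile body and pointing later duplicates at it. The sketch below is a simplified, in-memory illustration with hypothetical names (`Placement`, `dedupe_plan`). It keys a standard HashMap by the tile bytes themselves, where a real implementation would more likely key by a digest such as Blake3, and it leaves out the symlink, hardlink, and manifest fallback chain.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// What to do with a tile: store its bytes, or refer to an identical earlier tile.
#[derive(Debug, PartialEq, Eq)]
enum Placement {
    Store,
    ReferTo(String),
}

/// Decides, in input order, which tiles are stored and which become references.
fn dedupe_plan(tiles: &[(&str, &[u8])]) -> Vec<(String, Placement)> {
    let mut first_seen: HashMap<&[u8], &str> = HashMap::new();
    tiles
        .iter()
        .map(|&(path, bytes)| {
            let placement = match first_seen.entry(bytes) {
                // Identical content already stored under an earlier path.
                Entry::Occupied(seen) => Placement::ReferTo(seen.get().to_string()),
                // First time this content appears: store it and remember where.
                Entry::Vacant(slot) => {
                    slot.insert(path);
                    Placement::Store
                }
            };
            (path.to_string(), placement)
        })
        .collect()
}
```

Each `ReferTo` entry is where a sink would create a link or, for sinks without link support, write a manifest-only reference.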

Packfile archive sinks (--features packfile)

Archive the entire pyramid into a single file instead of a directory tree. Useful for cold storage, artifact upload, or reproducible builds.

# .tar, .tar.gz, and .zip are all supported
viprs pyramid blueprint.pdf --sink packfile://output.tar.gz

Extended EngineResult metrics

The EngineResult struct now carries detailed I/O and timing counters for integration with monitoring systems.

let result = EngineBuilder::new(&raster, plan, sink).run().unwrap();
println!(
    "read {} bytes, wrote {} bytes, {} retries, peak queue {}, {:?}",
    result.bytes_read, result.bytes_written,
    result.retry_count, result.queue_pressure_peak,
    result.duration,
);

Feature Flags

| Feature | Default | Description |
|---|---|---|
| default | none | No features enabled by default; core engine, filesystem sink, DeepZoom/XYZ/Google layouts, PNG/JPEG/raw encoding always available |
| pdfium | no | Vector PDF rendering via a dynamically-linked libpdfium |
| pdfium-static | no | Same as pdfium but links statically (larger binary, no runtime dep) |
| s3 | no | ObjectStoreSink against a caller-injected ObjectStore |
| tracing | no | Structured spans via the tracing crate for OpenTelemetry and other subscribers |
| packfile | no | Archive sink: write the pyramid into a .tar, .tar.gz, or .zip file |

Enable multiple features in Cargo.toml:

libviprs = { version = "0.3", features = ["s3", "tracing", "packfile"] }

Requirements

Native Dependencies

PDFium is built from source and published as GitHub Releases under libviprs/libviprs-dep. Every release tag ships four archives covering the full matrix of {linux, musl} × {amd64, arm64}, and each archive contains both the shared library (libpdfium.so) and the static archive (libpdfium.a):

| Archive | libc | Use when |
|---|---|---|
| pdfium-linux-x64.tgz | glibc | Debian, Ubuntu, RHEL, mainstream distros on x86_64 |
| pdfium-linux-arm64.tgz | glibc | Debian, Ubuntu, … on aarch64 |
| pdfium-musl-x64.tgz | musl | Alpine / musl-based distroless on x86_64 |
| pdfium-musl-arm64.tgz | musl | Alpine / musl-based distroless on aarch64 |

Loading a glibc .so from a musl process — or vice versa — fails at dlopen time, so match the archive libc to the runtime you deploy against. Fully-static musl binaries that cannot dlopen should link libpdfium.a via pdfium-render/static.
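
One way to keep the archive and the runtime libc in sync is to derive the archive name from the compilation target. The helper below is a hypothetical sketch, not part of libviprs or libviprs-dep, that maps Rust target properties to the four archive names listed above. In a build script, the three inputs could come from the `CARGO_CFG_TARGET_OS`, `CARGO_CFG_TARGET_ENV`, and `CARGO_CFG_TARGET_ARCH` environment variables that Cargo sets.

```rust
/// Maps (target_os, target_env, target_arch) to a published PDFium archive
/// name, or None when no archive in the release matrix matches.
fn pdfium_archive(target_os: &str, target_env: &str, target_arch: &str) -> Option<&'static str> {
    match (target_os, target_env, target_arch) {
        ("linux", "gnu", "x86_64") => Some("pdfium-linux-x64.tgz"),
        ("linux", "gnu", "aarch64") => Some("pdfium-linux-arm64.tgz"),
        ("linux", "musl", "x86_64") => Some("pdfium-musl-x64.tgz"),
        ("linux", "musl", "aarch64") => Some("pdfium-musl-arm64.tgz"),
        _ => None,
    }
}
```

Returning `None` for anything outside the matrix makes an unsupported target fail at build time rather than at dlopen time.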

The build tooling is documented in full at libviprs-dep/MANUAL.md (man-page-style reference: CLI, options, artifact layout, environment, exit statuses, troubleshooting). A pipeline-level overview lives at libviprs-dep/pdfium/README.md.

Related Crates

| Crate / Repo | Description |
|---|---|
| libviprs-cli | Command-line interface (viprs binary); see the CLI reference |
| libviprs-tests | Integration tests and fixtures |
| libviprs-dep | Prebuilt native dependencies (PDFium today); see the build manual and releases |