Hoard
Data lifecycle orchestration

Controlled, verifiable data movement across disk, stage, and tape.

Hoard orchestrates data movement across storage tiers: replicate files to secondary storage, stage them into tarballs, archive them to tape, and retrieve them on demand, with operational context tracking, auditable history, file tagging, and capacity utilization metrics.

Built for environments that still care about locality, retention, cost, and recoverability — not just object storage marketing language.

  • Replication targets for secondary copies
  • Stage targets for tarball creation and transport prep
  • Tape archive workflows with libraries, pools, and retrieval
  • Status, job history, audit events, and verification paths

Capabilities

What Hoard is built to do

Hoard is a practical file lifecycle tool rather than a generic “data platform.”

Replicate

Create and track secondary copies in replication targets for protection, locality, or distribution.

  • Path-based workflows
  • Tagging support
  • Queued job execution

Stage (Packaging)

Package files into tarballs on staging targets for export, transport, or space management workflows.

  • Tarball package sizing
  • Compression options
  • Stage metadata inspection
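
To make tarball package sizing concrete, here is a minimal Python sketch that groups files into batches that each stay under a size cap. It is not Hoard's implementation: the function name `plan_tarballs` and the in-order, start-a-new-batch-on-overflow strategy are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple


def plan_tarballs(files: Iterable[Tuple[str, int]], max_bytes: int) -> List[List[str]]:
    """Group (path, size) pairs into batches whose total size stays <= max_bytes.

    Hypothetical helper: files are taken in order, and a new batch starts when
    the next file would overflow the current one. A single file larger than
    max_bytes still gets a batch of its own.
    """
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for path, size in files:
        # Close the current batch if adding this file would exceed the cap.
        if current and used + size > max_bytes:
            batches.append(current)
            current, used = [], 0
        current.append(path)
        used += size
    if current:
        batches.append(current)
    return batches
```

Each returned batch would correspond to one tarball. A real packager would also need to account for tar header overhead and compression, which this sketch ignores.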

Archive to tape

Manage tape libraries, pools, inventory, stage-to-tape workflows, and tape-based retrieval paths.

  • Library and pool concepts
  • List contents of tapes
  • Direct retrieval indexing

Retrieve

Files can be retrieved from any location where Hoard has placed a copy, and Hoard uses the most efficient available source.

  • Just tell Hoard what you want
  • Data may be retrieved to any location Hoard can access
  • Remembers each file's original path
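
One way to picture "most efficient source" selection is a simple cost ranking over the known copies of a file. The sketch below is an assumption, not a description of Hoard's logic: the `Copy` type, the `pick_source` function, and the cost ordering (replica cheaper than staged tarball, staged tarball cheaper than tape) are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed relative retrieval cost per copy type (lower is cheaper).
# Illustrative values only; a real system would measure or configure these.
COST = {"replica": 1, "stage": 2, "tape": 3}


@dataclass(frozen=True)
class Copy:
    kind: str               # "replica", "stage", or "tape"
    location: str           # hypothetical target name
    available: bool = True  # False if the target is offline or the copy is gone


def pick_source(copies: List[Copy]) -> Optional[Copy]:
    """Return the cheapest available copy, or None if nothing is retrievable."""
    candidates = [c for c in copies if c.available and c.kind in COST]
    return min(candidates, key=lambda c: COST[c.kind], default=None)
```

With this ranking, a tape copy is chosen only when no replica or staged tarball is available.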

Manage Workflows

Workflows are managed as jobs that may be monitored, diagnosed, canceled, killed, rerun, requeued, and traced.

  • Monitor and control jobs
  • Review job history
  • Similar to HPC job scheduling

Verify and Inspect

Hoard maintains a record of every location where it has ever placed a file, and that record can be queried for any file.

  • Historical metadata is never removed
  • Verify current status of files
  • Always know where any file has ever been
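
The idea of history that is never removed can be sketched as an append-only log per file. The following Python example is purely illustrative: the `LocationHistory` class and its methods are hypothetical and say nothing about how Hoard actually stores its metadata.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, Dict, List, Set


@dataclass(frozen=True)
class Placement:
    location: str  # hypothetical target name, e.g. "fast-tier"
    present: bool  # True when a copy was written, False when it was removed


class LocationHistory:
    """Append-only log of placements per file path (illustrative only)."""

    def __init__(self) -> None:
        self._log: DefaultDict[str, List[Placement]] = defaultdict(list)

    def record(self, path: str, location: str, present: bool = True) -> None:
        # Entries are only ever appended, so earlier history stays queryable.
        self._log[path].append(Placement(location, present))

    def ever_at(self, path: str) -> Set[str]:
        """Every location that has ever held the file."""
        return {p.location for p in self._log.get(path, [])}

    def currently_at(self, path: str) -> Set[str]:
        """Locations whose most recent entry says the copy is present."""
        latest: Dict[str, bool] = {}
        for p in self._log.get(path, []):
            latest[p.location] = p.present
        return {loc for loc, present in latest.items() if present}
```

Recording a removal as a new entry, rather than deleting the old one, is what lets the same log answer both "where is it now" and "where has it ever been".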

Auditing and Traceability

Events are tracked and can be queried for audits, investigations, troubleshooting, and more. These may be combined with job traces to provide a complete picture of what took place.

  • Tracks who, what, and when
  • Output formats include JSON, CSV, and columnar

Tagging

All files can be tagged, and these tags may be queried.

  • Tag metrics (counts, capacity, locations, and more)
  • Some output formats (e.g. JSON, CSV) from tag queries may be used as input for external workflows (AI, ML, automation, analysis, and more)
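
As an example of feeding tag query output into an external workflow, the Python sketch below totals capacity per storage location from a JSON array. The record shape (`path`, `location`, `size_bytes`) is an assumption made for illustration; it is not Hoard's documented schema, so adjust the field names to whatever the tool actually emits.

```python
import json
from collections import Counter
from typing import Dict


def capacity_by_location(tag_query_json: str) -> Dict[str, int]:
    """Sum file sizes per location from a JSON array of records.

    Assumed record shape (hypothetical, not a documented schema):
    {"path": str, "location": str, "size_bytes": int}
    """
    totals: Counter = Counter()
    for record in json.loads(tag_query_json):
        totals[record["location"]] += int(record["size_bytes"])
    return dict(totals)
```

The same pattern applies to CSV output with `csv.DictReader` in place of `json.loads`.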

How it works

Workflows

01

Define targets

Configure replication targets, stage targets, tape stage targets, libraries, and pools.

02

Queue work

Submit replication, staging, retrieval, and tape jobs to the queue.

03

Track outcomes

Hoard records locations, tarball membership, job events, and status information for later inspection.

04

Recover deliberately

Retrieve from replication copies, staged tarballs, or tape-backed data when the origin no longer has what you need.

User actions

Examples

Replicate active data

hoard replicate /data/projectA fast-tier --tags prod,analytics

Maintain an additional copy in a defined replication target.

Prepare a transportable archive

hoard stage /data/projectA stage-01 12TB xz

Create staged tarballs with explicit sizing and compression.

Write staged data to tape

hoard archive-tape /data/projectA tape-stage-01 --library projects --pool lto9 --max-bytes 650GB

Push cold data into a managed tape workflow with control over tarball sizing.

Inspect or recover later

hoard status --verify /data/projectA
hoard retrieve-tape /data/projectA --library projects

Check state, then retrieve from tape when the origin path no longer holds the data you need.

Display usage metrics with tags

hoard tag metrics prod

View usage metrics for a specified tag across all defined storage targets.