Event Archival
Archive old events to S3, R2, or MinIO as compressed JSONL.
How it works
As events age past a configurable threshold, Keplor archives them to an S3-compatible object store and deletes them from SQLite to keep the database lean.
| Data | After archival |
|---|---|
| Recent events (within archive_after_days) | Stay in SQLite — fully queryable |
| Old events (past threshold) | Compressed JSONL in S3/R2 — deleted from SQLite |
| Daily rollups | Always in SQLite — aggregation queries unaffected |
| Archive manifests | Tracked in SQLite for audit and status |
All query, stats, rollup, and quota endpoints continue working on data that remains in SQLite. The has_archived_data flag in query responses indicates when archived data exists for the queried time range.
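A client can branch on that flag to decide whether the queried window also reaches into archived data. A minimal sketch (only the has_archived_data flag is documented; the rest of the response shape here is an assumption for illustration):

```python
import json

def covers_archived_range(response_body: str) -> bool:
    """True when the queried time range extends into data archived to S3/R2.

    has_archived_data comes from the query response; surrounding fields
    shown in tests are illustrative, not Keplor's actual schema.
    """
    return bool(json.loads(response_body).get("has_archived_data", False))
```

When the flag is set, the SQLite results are still complete for the retained window; only events past the archive threshold live in object storage.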
Build with S3 support
```shell
$ cargo build --release --features mimalloc,s3
```

Or with Docker:

```dockerfile
# Dockerfile already includes mimalloc.
# To add S3, edit the build line:
RUN cargo build --release --locked --target x86_64-unknown-linux-musl \
    -p keplor-cli --features mimalloc,s3
```

Archive lifecycle
Every archive_interval_secs (default 1 hour), Keplor checks whether archival should run based on age and/or database size triggers. When triggered:
- Force rollup for affected days (preserves daily aggregations after deletion)
- Query events older than archive_after_days, ordered by (user_id, timestamp)
- Group by user + day, serialize to JSONL, compress with zstd, upload to S3/R2
- Record manifest in SQLite for audit and tracking
- Delete archived events from SQLite, then VACUUM to reclaim disk space
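The grouping and serialization steps above can be sketched as follows (the event shape and helper name are assumptions; zstd compression and the S3 upload are left out):

```python
import json
from collections import defaultdict
from datetime import datetime, timezone

def chunk_events(events):
    """Group events into independent (user_id, day) chunks of JSONL.

    Mirrors the lifecycle above: events are ordered by (user_id, timestamp),
    then grouped per user per UTC day. Each chunk's JSONL body would next be
    zstd-compressed and uploaded as a single object.
    """
    chunks = defaultdict(list)
    for ev in sorted(events, key=lambda e: (e["user_id"], e["timestamp"])):
        day = datetime.fromtimestamp(ev["timestamp"], tz=timezone.utc).date().isoformat()
        chunks[(ev["user_id"], day)].append(json.dumps(ev))
    return {key: "\n".join(lines) for key, lines in chunks.items()}
```

Because each chunk is handled independently, a failed upload for one (user, day) pair leaves the other chunks unaffected.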
S3/R2 key format:
```text
{prefix}/user_id={user_id}/day={YYYY-MM-DD}/{archive_id}.jsonl.zstd
```

For example: events/user_id=alice/day=2026-04-15/{archive_id}.jsonl.zstd

Each chunk (user + day) is archived independently. If one upload fails, the remaining chunks continue. Failed events stay in SQLite and are retried on the next cycle.
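Under that layout, building the object key for one chunk is straightforward (function name and the example archive_id are illustrative):

```python
from datetime import date

def archive_key(prefix: str, user_id: str, day: date, archive_id: str) -> str:
    """Build the S3/R2 object key for one archived (user, day) chunk."""
    key = f"user_id={user_id}/day={day.isoformat()}/{archive_id}.jsonl.zstd"
    return f"{prefix}/{key}" if prefix else key

# e.g. archive_key("events", "alice", date(2026, 4, 15), "a1b2c3")
# → events/user_id=alice/day=2026-04-15/a1b2c3.jsonl.zstd
```

The Hive-style `user_id=…/day=…` segments make archives easy to query later with tools like Athena or DuckDB that understand partitioned paths.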
Cloudflare R2
R2 is the recommended choice for most deployments: 10 GB free storage, zero egress fees, S3-compatible API.
- Create a bucket in the Cloudflare dashboard (e.g. keplor-archive)
- Create an R2 API token with Object Read & Write permissions
- Add to keplor.toml:
```toml
[archive]
bucket = "keplor-archive"
endpoint = "https://<account-id>.r2.cloudflarestorage.com"
region = "auto"
access_key_id = "your-r2-access-key"
secret_access_key = "your-r2-secret-key"
prefix = "events"
archive_after_days = 30
```

AWS S3
```toml
[archive]
bucket = "keplor-archive"
endpoint = "https://s3.us-east-1.amazonaws.com"
region = "us-east-1"
access_key_id = "AKIA..."
secret_access_key = "..."
prefix = "events"
archive_after_days = 30
```

Standard S3 pricing applies. Consider S3 Intelligent-Tiering for infrequently accessed archives.
MinIO (self-hosted)
```toml
[archive]
bucket = "keplor-archive"
endpoint = "http://localhost:9000"
region = "us-east-1"
access_key_id = "minioadmin"
secret_access_key = "minioadmin"
path_style = true # required for MinIO
archive_after_days = 30
```

Any S3-compatible service works: DigitalOcean Spaces, Backblaze B2, Wasabi, etc.
Configuration reference
| Key | Type | Default | Description |
|---|---|---|---|
| bucket | string | | S3 bucket name (required) |
| endpoint | string | | S3 endpoint URL (required) |
| region | string | | Region ("auto" for R2, "us-east-1" for AWS) |
| access_key_id | string | | Access key (required) |
| secret_access_key | string | | Secret key (required) |
| prefix | string | "" | Key prefix in bucket (e.g. "events") |
| path_style | bool | false | Path-style addressing (required for MinIO) |
| archive_after_days | u64 | 30 | Archive events older than this many days |
| archive_after_hours | u64 | 0 | Sub-day archival (hours); overrides archive_after_days when non-zero. Set to 1 for hourly offload. |
| archive_threshold_mb | u64 | 0 | Also archive when SQLite exceeds this size (MB); 0 = age-only. |
| archive_batch_size | usize | 10000 | Maximum events per JSONL archive file |
| archive_interval_secs | u64 | 3600 | How often the archive loop runs (seconds); default: 1 hour. |
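The interaction between archive_after_days and archive_after_hours reduces to a simple override rule, sketched here (a restatement of the table above, not Keplor's actual code):

```python
from datetime import timedelta

def effective_archive_age(archive_after_days: int = 30,
                          archive_after_hours: int = 0) -> timedelta:
    """A non-zero archive_after_hours overrides archive_after_days."""
    if archive_after_hours > 0:
        return timedelta(hours=archive_after_hours)
    return timedelta(days=archive_after_days)
```

So `archive_after_hours = 1` yields a one-hour threshold regardless of what archive_after_days is set to.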
Archive vs. retention
If archive_after_days is greater than the shortest retention tier's days value, GC will delete events before they can be archived. Keplor warns about this at startup. Always set archive_after_days lower than your shortest tier's retention.
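The constraint amounts to the following check, which mirrors the startup warning (helper name and tier representation are assumptions):

```python
def archive_beats_gc(archive_after_days: int, tier_retention_days: list[int]) -> bool:
    """True when events are archived before the shortest retention tier's GC deletes them."""
    return archive_after_days < min(tier_retention_days)

# A 30-day archive threshold loses data under a 14-day tier:
# archive_beats_gc(30, [14, 90]) → False
# archive_beats_gc(7, [14, 90])  → True
```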
GC & cleanup
Archival runs before GC in the combined loop to prevent data loss. Daily rollups are force-refreshed before deletion, so aggregation queries remain accurate even after events are archived and removed from SQLite.
S3 connectivity is verified at startup. Bad credentials cause an immediate error log and disable archival, rather than silently failing hours later on the first archive cycle.
CLI commands
Archive manually (outside the automatic cycle):
```shell
$ keplor archive --config keplor.toml --older-than-days 14
```

Check archive status:

```shell
$ keplor archive_status --config keplor.toml
```

Next steps
- Configuration reference — all [archive] fields.
- Storage config — max_db_size_mb and other storage settings.
- Integration guide — full setup with retention tiers and auth.