logzip (Rust)

Compress logs before sending them to an LLM. Powered by Rust and PyO3.
raw log → [logzip compress] → compressed text → LLM (Claude Code / Cursor / API)
Before / After
Raw Log (Uvicorn):
INFO: 127.0.0.1:45678 - "GET /api/v1/status HTTP/1.1" 200 OK
INFO: 127.0.0.1:45679 - "GET /api/v1/status HTTP/1.1" 200 OK
... (100 similar lines) ...
logzip output:
--- PREFIX ---
INFO: 127.0.0.1:
--- LEGEND ---
#0# = - "GET /api/v1/status HTTP/1.1" 200 OK
--- BODY ---
45678 #0#
45679 #0#
...
Typical savings: 52–58% on structured logs (systemd, uvicorn, docker).
Anomalies and unique lines stay uncompressed — visible at a glance in the BODY.
Compression is lossy-semantic by default (sub-second timestamps trimmed, whitespace collapsed — meaning preserved). Use --lossless for a byte-exact roundtrip.
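To make the PREFIX / LEGEND / BODY layout concrete, here is a minimal Python sketch that expands the sample output above back into log lines. It is not logzip's decoder. It assumes exactly the layout shown in the sample: three `--- NAME ---` section markers, legend entries shaped `#N# = text`, and a single prefix that applies to every BODY line. The function name `expand_logzip_sample` is hypothetical, and real output may contain features the sample does not show.

```python
def expand_logzip_sample(text: str) -> list[str]:
    """Expand text laid out like the sample above into full log lines.

    Assumption: the format is inferred from the sample only, not from
    logzip's actual specification.
    """
    sections = {"PREFIX": [], "LEGEND": [], "BODY": []}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        # Section markers look like '--- PREFIX ---'.
        if stripped.startswith("--- ") and stripped.endswith(" ---"):
            current = stripped[4:-4]
            sections.setdefault(current, [])
            continue
        if current is not None and stripped:
            sections[current].append(line)

    prefix = sections["PREFIX"][0] if sections["PREFIX"] else ""

    # Legend entries look like '#0# = replacement text'.
    legend = {}
    for entry in sections["LEGEND"]:
        key, _, value = entry.partition(" = ")
        legend[key] = value

    expanded = []
    for row in sections["BODY"]:
        for key, value in legend.items():
            row = row.replace(key, value)
        expanded.append(prefix + row)
    return expanded


sample = """--- PREFIX ---
INFO: 127.0.0.1:
--- LEGEND ---
#0# = - "GET /api/v1/status HTTP/1.1" 200 OK
--- BODY ---
45678 #0#
45679 #0#
"""

lines = expand_logzip_sample(sample)
for line in lines:
    print(line)
```

Running it on the sample reproduces the two raw Uvicorn lines from the "Raw Log" example.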
Why use logzip? (RAG & LLM)
When working with logs in LLMs (Claude, GPT, RAG systems), you face two problems:
- Context Limit: Logs are huge. A 10 MB log is ~2.5M tokens.
- Noise: 90% of the log consists of repeated INFO lines and identical requests that drown out the real error.
logzip is well-suited for RAG pipelines: it compresses the context before it reaches the model, which cuts token costs and can improve answer accuracy because anomalies stand out.
Performance (7.96 MB Log, ~2M tokens)
Benchmarked on a real 7.96 MB production log.
logzip modes
| Mode | CLI | Time (ms) | Size (KB) | Saved (%) | Output type |
|---|---|---|---|---|---|
| fast | --quality fast | ~200 | ~4,900 | ~40% | text/LLM |
| balanced | --quality balanced | 404 | 3,928 | 52% | text/LLM |
| balanced + 2 passes ★ | --quality balanced --bpe-passes 2 | 418 | 3,404 | 58% | text/LLM |
| max | --quality max | ~1,600 | ≤ 3,404 | ≥ 58% | text/LLM |
★ Recommended default. A second compression pass finds repeated token sequences in already-compressed text: 14 ms overhead for about 6 percentage points more savings than balanced alone (52% → 58%).
--quality max runs an auto-search over several (legend, passes) configurations and returns the smallest output. It includes (128, 2) — the recommended config — in its grid, so it can never lose to balanced --bpe-passes 2, but it costs ~4× the time. Use it when you want the best ratio and runtime doesn't matter; otherwise stick with the recommended default. An explicit --bpe-passes N disables the search and pins a single config.
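The "smallest output wins" idea behind `--quality max` can be sketched in a few lines of Python. This illustrates the general technique and is not logzip's implementation. `smallest_output` and `fake_compress` are hypothetical names, `fake_compress` is a stand-in that only mimics shrinking output, and the grid values other than (128, 2) are invented for the example.

```python
from typing import Callable, Iterable

Config = tuple[int, int]  # (legend size, number of passes)


def smallest_output(
    text: str,
    compress: Callable[[str, int, int], str],
    grid: Iterable[Config],
) -> tuple[Config, str]:
    """Try every config in the grid and keep the shortest result."""
    best_cfg, best_out = None, None
    for cfg in grid:
        out = compress(text, *cfg)
        if best_out is None or len(out) < len(best_out):
            best_cfg, best_out = cfg, out
    if best_out is None:
        raise ValueError("grid must not be empty")
    return best_cfg, best_out


def fake_compress(text: str, legend_size: int, passes: int) -> str:
    # Stand-in only: pretends that bigger legends and more passes shrink output.
    keep = max(1, len(text) - legend_size // 32 - 4 * passes)
    return text[:keep]


text = "x" * 100
grid = [(64, 1), (128, 2), (256, 3)]  # only (128, 2) comes from the README
best_cfg, best_out = smallest_output(text, fake_compress, grid)
print(best_cfg, len(best_out))
```

Because (128, 2) is one of the candidates, the selected output can never be longer than what (128, 2) alone would produce, which is the guarantee described above. The price is one full compression run per grid entry.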
vs. binary compressors (for context)
| Tool | Time (ms) | Size (KB) | Saved (%) | LLM-readable? |
|---|---|---|---|---|
| lz4 | 6 | 1,280 | 84% | No |
| zstd (lvl 3) | 14 | 819 | 90% | No |
| zlib (lvl 6) | 69 | 840 | 90% | No |
| logzip (recommended) | 418 | 3,404 | 58% | Yes |
Binary compressors produce opaque blobs that LLMs cannot read. logzip gives up roughly 30 percentage points of savings (58% vs. 84–90%) in exchange for fully human- and LLM-readable output.
Token estimation: 1 token ≈ 4 characters (rough estimate for English-like logs).
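The "Saved (%)" columns and the token heuristic are simple arithmetic. As a quick check, this Python sketch recomputes them from the table sizes, assuming the raw log is 8,151 KB (the KB figure for the 7.96 MB benchmark log) and that percentages are rounded to whole numbers. The helper names are hypothetical.

```python
RAW_KB = 8_151  # raw size of the 7.96 MB benchmark log, in KB


def saved_percent(compressed_kb: float, raw_kb: float = RAW_KB) -> int:
    """Share of the raw size removed by compression, as a whole percent."""
    return round(100 * (1 - compressed_kb / raw_kb))


def estimate_tokens(num_chars: int) -> int:
    """Rough token count using the 1 token = 4 characters heuristic."""
    return num_chars // 4


sizes_kb = {
    "balanced": 3_928,
    "balanced + 2 passes": 3_404,
    "lz4": 1_280,
    "zstd (lvl 3)": 819,
    "zlib (lvl 6)": 840,
}
savings = {name: saved_percent(kb) for name, kb in sizes_kb.items()}

for name, pct in savings.items():
    print(f"{name}: {pct}% saved")
print(estimate_tokens(7_960_000))  # the 7.96 MB log, counted as 7.96 million characters
```

The heuristic is coarse: real tokenizers split digits, punctuation, and identifiers differently, so treat these token counts as order-of-magnitude estimates.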
Economic Impact
┌──────────────────────────────────────────────────────────┐
│ logzip Savings (7.96 MB Production Log) │
├──────────────────────────────────────────────────────────┤
│ Raw Size: 8,151 KB (~1,990,000 tokens) │
│ After balanced: 3,928 KB (~959,000 tokens, -52%) │
│ After 2 passes: 3,404 KB (~831,000 tokens, -58%) │
├──────────────────────────────────────────────────────────┤
│ Cost Before: $5.97 │
│ Cost After: $2.49 (Claude 3.5 Sonnet Input) │
│ LLM Efficiency: 2.4x larger context for the same price │
└──────────────────────────────────────────────────────────┘
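The figures in the box can be reproduced with a short calculation. The sketch below assumes an input price of $3 per million tokens, which is what $5.97 for ~1,990,000 tokens implies. Check current pricing before relying on it. The token counts are taken from the box as-is.

```python
PRICE_PER_MILLION_USD = 3.00  # assumption: implied by $5.97 for ~1.99M tokens

raw_tokens = 1_990_000        # raw log, from the box above
compressed_tokens = 831_000   # after balanced + 2 passes, from the box above


def input_cost(tokens: int, price_per_million: float = PRICE_PER_MILLION_USD) -> float:
    """Input cost in USD for a given token count, rounded to cents."""
    return round(tokens * price_per_million / 1_000_000, 2)


cost_before = input_cost(raw_tokens)
cost_after = input_cost(compressed_tokens)
context_gain = round(raw_tokens / compressed_tokens, 1)

print(f"before: ${cost_before:.2f}, after: ${cost_after:.2f}, gain: {context_gain}x")
```

The context gain is the ratio of raw to compressed tokens: the same budget covers about 2.4 times as much original log text.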
Install
Python API + logzip-py CLI:
pip install logzip
Rust CLI + MCP Server:
cargo install logzip
CLI
Two CLIs are available. Both provide compress and decompress subcommands with identical flags.
Rust binary (cargo install logzip → logzip):
# stdin → stdout
logzip compress < app.log
# quality preset (fast|balanced|max)
logzip compress --quality balanced < app.log
# recommended: balanced + second pass
logzip compress --quality balanced --bpe-passes 2 < app.log
# with preamble (LLM decode instructions at the top)
logzip compress --preamble < app.log > compressed.txt
# save + show stats
logzip compress --stats -i app.log -o app.logzip
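If you want to call the Rust binary from a Python script instead of a shell, a thin `subprocess` wrapper is one option. This is a sketch, not part of logzip: the helper names are hypothetical, it uses only the flags shown above, and it assumes the `logzip` binary is on your PATH.

```python
import subprocess


def build_compress_cmd(quality=None, bpe_passes=None, preamble=False):
    """Build the argv for `logzip compress` from the flags shown above."""
    cmd = ["logzip", "compress"]
    if quality is not None:
        cmd += ["--quality", quality]
    if bpe_passes is not None:
        cmd += ["--bpe-passes", str(bpe_passes)]
    if preamble:
        cmd.append("--preamble")
    return cmd


def compress_text(raw_log: str, **options) -> str:
    """Pipe raw_log through the CLI (stdin to stdout) and return the result.

    Raises CalledProcessError if the CLI exits with a non-zero status.
    """
    result = subprocess.run(
        build_compress_cmd(**options),
        input=raw_log,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Recommended default from the benchmark table: balanced + second pass.
recommended = build_compress_cmd(quality="balanced", bpe_passes=2)
print(" ".join(recommended))
```

Calling `compress_text(open("app.log").read(), quality="balanced", bpe_passes=2)` would then return the compressed text, ready to paste into a prompt.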