UltraCompress

Extreme compression for large language models.

UltraCompress compresses transformer language models to below 4 bits per weight, the point at which the open-source baselines in our benchmark start to lose quality sharply. At 2.798 bits per weight it recorded zero catastrophic failures on a 6-model head-to-head benchmark.

v0.1 alpha

Pre-compressed reference models are uploading to the Hugging Face Hub throughout April–May 2026. Run uc list for the live catalog at any time.

The compression methods are the subject of pending U.S. patent applications (USPTO 64/049,511 and 64/049,517, filed 2026-04-25). This CLI is the open-source distribution layer for the pre-compressed reference models, which are published under the sipsalabs organization on the Hugging Face Hub.

Install

# With pip
pip install ultracompress

# With uv
uv add ultracompress

# From source
git clone https://github.com/sipsalabs/ultracompress.git
cd ultracompress
pip install -e ".[dev]"

60-second quickstart

# Browse the official catalog
uc list

# Download a pre-compressed model
uc pull sipsalabs/<model-id>

# Inspect what's in it
uc info ./models/<model-id>

# Run downstream benchmarks
uc bench ./models/<model-id> --tasks hellaswag --limit 500
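If you want to script the same steps, for example in a CI job, the commands above can be driven from Python. This is a minimal sketch, not part of UltraCompress: the helper names `quickstart_commands` and `run_quickstart` are hypothetical, and it assumes, as the quickstart implies, that `uc pull` places the model under `./models/<model-id>`. It only uses the subcommands and flags shown above.

```python
import shlex
import subprocess


def quickstart_commands(model_id: str, tasks: str = "hellaswag", limit: int = 500) -> list[list[str]]:
    """Build the pull/info/bench commands from the quickstart for one model."""
    # Assumption: `uc pull` stores the artifact under ./models/<model-id>.
    local_dir = f"./models/{model_id}"
    return [
        ["uc", "pull", f"sipsalabs/{model_id}"],
        ["uc", "info", local_dir],
        ["uc", "bench", local_dir, "--tasks", tasks, "--limit", str(limit)],
    ]


def run_quickstart(model_id: str, dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in quickstart_commands(model_id):
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops at the first command that exits non-zero.
            subprocess.run(cmd, check=True)
```

Calling `run_quickstart("<model-id>")` with the default `dry_run=True` only prints the commands, which is a cheap way to confirm the paths before downloading anything.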

What's in a pre-compressed artifact

Each artifact is a directory with:

model.safetensors — quantized weights in our compressed format
ultracompress.json — provenance manifest (bpw, base model ID, SHA-256 of weights, license, method version)
tokenizer/ — pre-loaded tokenizer matching the base model
LICENSE — the per-model license (research-free or commercial-paid; contact legal@sipsalabs.com)
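Because the manifest records a SHA-256 of the weights, a downloaded artifact can be checked for corruption or tampering before use. The sketch below shows one way to do that with the Python standard library. The manifest key `weights_sha256` is an assumption, since the schema of ultracompress.json is not spelled out here; open the file and substitute the real field name.

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(artifact_dir: str, sha_key: str = "weights_sha256") -> bool:
    """Compare the digest of model.safetensors with the manifest entry.

    `sha_key` is a hypothetical field name; adjust it to match the manifest.
    """
    root = Path(artifact_dir)
    manifest = json.loads((root / "ultracompress.json").read_text(encoding="utf-8"))
    expected = str(manifest[sha_key]).lower()
    actual = sha256_of_file(root / "model.safetensors")
    return actual == expected
```

Reading in chunks keeps memory use flat even for multi-gigabyte weight files.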

Why we exist

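The published methods most teams use (bitsandbytes, GPTQ, AWQ, HQQ) all hit a wall at about 4 bits per weight; below that, quality collapses. In our benchmark, HQQ falls to 72.46% median quality retention at 3.5 bpw and 3.46% at 2.5 bpw, while UltraCompress retains 95.63% at 2.798 bpw: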

Method	Bits per weight	Quality retention (cohort median)	Catastrophic failures
bitsandbytes int8	8.000	99.75%	0/6
bitsandbytes NF4	4.000	98.31%	0/6
HQQ 4-bit g64	4.500	97.72%	0/6
UltraCompress 2.8 bpw	2.798	95.63%	0/6
HQQ 3-bit g64	3.500	72.46%	1/6
HQQ 2-bit g64	2.500	3.46%	6/6

Source: 6-model × 8-method × 500-sample head-to-head benchmark on WikiText-103 perplexity ratio, deterministic seed, full SHA-256 verification manifest available under NDA.
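To make the bits-per-weight column concrete, the arithmetic below converts a bpw figure into an approximate weight footprint and a compression ratio relative to 16-bit weights. It is a back-of-the-envelope sketch: the 7-billion-parameter count is a hypothetical example rather than one of the benchmarked models, and it assumes the bpw figure applies to every parameter, which may not hold for embeddings or other tensors an artifact keeps at higher precision.

```python
def estimated_weight_bytes(num_params: int, bits_per_weight: float) -> float:
    """Approximate storage for the weights alone, in bytes."""
    return num_params * bits_per_weight / 8


def compression_ratio(bits_per_weight: float, baseline_bits: float = 16.0) -> float:
    """How many times smaller the weights are than a baseline precision."""
    return baseline_bits / bits_per_weight


if __name__ == "__main__":
    params = 7_000_000_000  # hypothetical model size, for illustration only
    for label, bpw in [("fp16 baseline", 16.0), ("NF4", 4.0), ("UltraCompress", 2.798)]:
        gigabytes = estimated_weight_bytes(params, bpw) / 1e9
        print(f"{label:>14}: {gigabytes:6.2f} GB, {compression_ratio(bpw):.2f}x vs fp16")
```

Under these assumptions, 2.798 bpw works out to roughly 2.45 GB for 7 billion parameters, compared with 14 GB at 16 bits.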

Where to go next

First time here? Quickstart
Want to understand the methods? Compression methods overview
Need to integrate with your inference stack? Integration guides
Looking for a specific model? Run uc list for the live catalog.
Deploying in a commercial product? Email legal@sipsalabs.com.

Status

UltraCompress is in public alpha as of v0.1.0 (April 2026). The CLI is stable for uc list, uc pull, uc info, and uc bench. Self-compression (uc compress <model>) is intentionally not yet shipped; it is being held back until the patent-pending compression methods are formally protected. Targeted v0.2 release: late Q3 2026.

Stay in touch

Website: sipsalabs.com
GitHub: github.com/sipsalabs/ultracompress
Hugging Face: huggingface.co/sipsalabs
PyPI: pypi.org/project/ultracompress
Twitter: @sipsalabs
Email: founder@sipsalabs.com for commercial / partnership inquiries