Token Compressor
Semantic prompt compression using local LLM rewriting with embedding validation to reduce token usage by 40-60%.
About
Token Compressor is a two-stage pipeline that compresses prompts before they reach an LLM. In the first stage, a local model (llama3.2:1b via Ollama) rewrites the prompt to its semantic minimum while preserving all conditionals and negations. In the second stage, the pipeline validates the result by computing the cosine similarity between the embeddings of the original and compressed prompts, generated with nomic-embed-text. If the similarity drops below a configurable threshold, the original prompt is used instead.
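The validation gate can be illustrated with a minimal sketch. This is not the project's actual code: the function names `cosine_similarity` and `compress_with_gate`, the `rewrite` and `embed` callables, and the default threshold of 0.85 are all assumptions made for illustration. In a real setup, `rewrite` would call the local llama3.2:1b model and `embed` would call nomic-embed-text through Ollama.

```python
import math
from typing import Callable, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar.
        return 0.0
    return dot / (norm_a * norm_b)


def compress_with_gate(
    prompt: str,
    rewrite: Callable[[str], str],            # hypothetical: local LLM rewrite step
    embed: Callable[[str], Sequence[float]],  # hypothetical: embedding model call
    threshold: float = 0.85,                  # assumed default, not from the project
) -> str:
    """Return the rewritten prompt, or the original if similarity is too low."""
    candidate = rewrite(prompt)
    score = cosine_similarity(embed(prompt), embed(candidate))
    return candidate if score >= threshold else prompt
```

Passing the two model calls in as parameters keeps the gate logic independent of how Ollama is reached, so it can be exercised with stub functions before a local model is available.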
The system requires Ollama running locally and exposes a compress_prompt tool. It achieves 40-60% token reduction across English and Swedish prompts while maintaining semantic fidelity through the embedding validation gate.
