Your Ai Forgets.
CogmemAi Remembers.
Autonomous robots. Self-driving vehicles. Defense systems. Coding assistants. Any Ai system that needs to remember. CogmemAi gives your Ai persistent recall across sessions, devices, users, and teams — and captures knowledge autonomously, even when your Ai forgets to save.
npm install -g cogmemai-mcp && npx cogmemai-mcp setup

The Problem Every Developer Knows
Every new session starts from zero. You re-explain your tech stack, your architecture, your preferences. Built-in memory is a flat file with no search, no structure, no intelligence. Local SQLite files die with your laptop.
Context Loss
Every session starts blank. Your Ai doesn't remember your tech stack, conventions, or past decisions.
Local Tools Break
Local memory tools crash, leak 15GB+ RAM, corrupt databases, and don't follow you to a new machine, a new editor, or a teammate.
No Intelligence
Flat-file memory has no semantic search, no precision reranking, no automatic extraction, no importance ranking, and no ability to learn from itself.
Four-Layer Cognitive Memory
Production-proven Ai memory that actually works. No local databases. No crashes. No setup complexity.
Ai Extraction
Our Ai identifies architecture decisions, preferences, bugs, and patterns from your conversations — automatically.
Semantic Search
Find relevant memories by meaning, not keywords. Ask "how does auth work?" and get your architecture decisions back.
Intelligent Reranking
Every recall runs multiple retrieval strategies and fuses the results. A neural reranker then precision-scores every candidate so the most relevant memory always surfaces first.
Time-Aware Ranking
Recent and important memories surface first. Old, low-priority memories fade naturally. Always relevant context.
Compaction Recovery
When Claude Code compresses your context, CogmemAi detects it and automatically reloads your memories. No lost context, no re-explaining.
The Intelligence Engine + Auto-Skills
CogmemAi gets smarter every time you use it. Your memory doesn’t just store — it learns, connects, anticipates, and teaches your Ai how to behave.
Think Before You Speak
Your Ai checks its memory before every suggestion. Never again hear “let’s try X” when X was already tried, rejected, or completed. Prior context surfaces automatically on every topic.
Precision Reranking
Every recall runs a second-pass reranker that re-scores all candidates for precision, balanced with the initial ranking signal. The most relevant memory always surfaces first.
Self-Improving Recall
Memories that consistently help you rank higher automatically. Ones you never use fade naturally. Your recall quality improves with every session — no manual curation needed.
Auto-Linking Knowledge Graph
Related memories automatically connect when you save them. Your knowledge builds into a web of relationships, not a flat list. Explore connections across your entire project history.
Contradiction Detection
When recalled memories conflict with each other, CogmemAi flags the contradiction. Catch stale or outdated information before it causes problems in your code.
Query Synthesis
Ask a question and get one coherent answer synthesized from all your relevant memories — not just a list of matches. Like asking a teammate who’s read everything.
Context-Aware Ranking
Tell CogmemAi what you’re doing — debugging, planning, reviewing — and it boosts the right types of memories. The right context surfaces at the right time.
Cross-Project Intelligence
Patterns that appear across multiple projects are automatically promoted to global scope. Your best practices follow you everywhere without manual effort.
Auto-Skills
CogmemAi synthesizes your corrections and preferences into behavioral rules that teach your Ai how to work with you. Correct a mistake once — it never happens again.
Mandatory Rules
Define absolute requirements — “NEVER do X”, “ALWAYS do Y” — that surface in every session with no exceptions. Rules bypass all scoring and decay. Your hard constraints are always enforced.
Closed-Loop Learning
Skills track their own confidence. When a skill works, confidence rises. When it doesn’t, it adapts or retires. Your Ai assistant genuinely improves over time — no manual tuning.
Choose Your Mode
Three ways to run CogmemAi — pick the one that fits your workflow. All modes use the same npx cogmemai-mcp setup wizard.
Cloud (Recommended)
The full experience. Semantic search finds memories by meaning and precision-scores every result. Think Before You Speak ensures your Ai checks its memory before every suggestion. The Intelligence Engine auto-links related knowledge, detects contradictions, synthesizes answers, and generates behavioral skills from your patterns. Cross-device sync, team collaboration, and zero local maintenance.
Requires a free API key.
Local
Your data stays on your machine. A free API key is required for registration — like a software license key — but nothing leaves your device. Full-text search (FTS5) for quality recall. Works offline after initial setup. When you’re ready for the full experience, upgrading to cloud — with quantum-safe encryption, semantic search, and the Intelligence Engine — takes one command.
Free API key required. Data stays local. Works offline.
Hybrid
Local speed meets cloud intelligence. Saves to both simultaneously — reads from cloud when available, falls back to local when offline. Unsynced memories push automatically when connectivity returns. Perfect for developers who travel or work on unreliable networks.
Requires a free API key. Works offline with fallback.
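The hybrid behavior described above (write to both stores, read from cloud first, fall back to local, push unsynced items later) can be illustrated with a small sketch. This is not CogmemAi's implementation: `InMemoryStore`, `HybridMemory`, and `CloudUnavailable` are hypothetical names, and both backends are simulated in memory with a naive substring search.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class CloudUnavailable(Exception):
    """Raised by the simulated store when it is offline (hypothetical)."""


@dataclass
class InMemoryStore:
    """Stand-in for either backend; `online` simulates connectivity."""
    online: bool = True
    items: list[str] = field(default_factory=list)

    def save(self, text: str) -> None:
        if not self.online:
            raise CloudUnavailable()
        self.items.append(text)

    def search(self, query: str) -> list[str]:
        if not self.online:
            raise CloudUnavailable()
        return [m for m in self.items if query.lower() in m.lower()]


@dataclass
class HybridMemory:
    cloud: InMemoryStore
    local: InMemoryStore
    pending: list[str] = field(default_factory=list)  # saved locally, not yet in cloud

    def save(self, text: str) -> None:
        self.local.save(text)              # the local copy is always written
        try:
            self.cloud.save(text)
        except CloudUnavailable:
            self.pending.append(text)      # queue for a later push

    def recall(self, query: str) -> list[str]:
        try:
            return self.cloud.search(query)    # prefer cloud when reachable
        except CloudUnavailable:
            return self.local.search(query)    # offline fallback

    def sync(self) -> int:
        """Push queued memories in order; return how many were pushed."""
        pushed = 0
        while self.pending:
            try:
                self.cloud.save(self.pending[0])
            except CloudUnavailable:
                break
            self.pending.pop(0)
            pushed += 1
        return pushed
```

Calling `sync()` once connectivity returns drains the queue in the order the memories were saved, so the cloud copy ends up matching the local one.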
What Only CogmemAi Does
95.10% on LongMemEval — top published score on the field’s hardest long-term memory benchmark. 91% on LoCoMo, above human performance (87.9%). 35+ tools, autonomous memory capture, enterprise-grade semantic search, Think Before You Speak proactive recall, self-improving intelligence + auto-skills + mandatory rules, zero maintenance. Not just one score on a test — the most complete Ai memory system available.
Autonomous Memory
Every memory system fails the same way: the Ai has to choose to save, and it doesn’t. CogmemAi captures your work autonomously at the infrastructure level — decisions, file changes, bug fixes, and deployments land in memory without a single prompt. Stop reminding your Ai to remember. It just does.
Compaction Recovery
When Claude Code auto-compacts your context, everything is lost. CogmemAi detects it, saves your session before it happens, and automatically reloads your memories after. It’s like we discovered the cure for Ai amnesia.
One Command Setup
Run npx cogmemai-mcp setup and that’s it. Choose your mode, enter your free API key (required in every mode, including local), and the wizard configures everything. You’re up and running in under 60 seconds. No Python. No Docker. No vector databases.
Zero Local Infrastructure
In cloud mode, there’s nothing running on your machine except a thin MCP client. No background daemons leaking RAM. No Python environments to manage. All intelligence runs server-side. It just works.
Built-In Skill
Includes an Anthropic-standard Skill that teaches Claude best practices for memory management — when to save, importance scoring, session workflows, and more.
Memory Health Score
See how healthy your memory system is at a glance. A 0-100 score with actionable factors so you always know the state of your project knowledge.
Self-Tuning Memory
Memories that matter get promoted automatically. Stale, unused memories fade and archive on their own. Your knowledge base stays clean without manual maintenance.
Session Replay
Every session starts with a summary of what you did last time. Pick up exactly where you left off — no scrolling through old conversations.
Your Memory Follows You Everywhere
Cloud memory means your knowledge isn't trapped in one tool, one model, or one machine.
Switch Editors Freely
Memories created in Claude Code are instantly available in Cursor, Windsurf, Cline, and any MCP-compatible tool. No migration, no export/import.
Model-Agnostic
Switch between Opus, Sonnet, Haiku — or any model your editor supports. Your memories persist regardless of which Ai model is running.
Survives Everything
New laptop? New OS? New editor? Your memories are in the cloud. Log in anywhere and your full project knowledge is waiting.
Perfect for Teams
One developer documents an architecture decision. Every teammate's Ai assistant knows about it instantly.
Shared Project Knowledge
When one developer saves a bug fix, documents a convention, or records a decision, every team member's Ai assistant has that knowledge immediately. No syncing, no merge conflicts, no stale local databases.
Privacy Controls
Sensitive memories are automatically excluded from team sharing. Personal preferences stay personal. Project knowledge is shared. You control the boundary.
Why Cloud Memory Wins
Local memory tools store files on your machine. CogmemAi Cloud is a full intelligence engine. Here’s what you get.
| | CogmemAi Cloud | Any Local Memory | Built-in CLAUDE.md |
|---|---|---|---|
| Benchmark accuracy | 95.10% LongMemEval, 91% LoCoMo | 20–40% typical | N/A |
| Retrieval hit rate | 95% | 40–60% | Manual lookup |
| Search method | Vector search + keyword + reranking | Keyword or basic vectors | None |
| Precision reranking | Yes — best result always surfaces first | No | No |
| Setup time | 60 seconds | 15–30 minutes | Manual |
| Autonomous memory capture | Yes — saves without Ai cooperation | No — requires Ai to call save | Manual only |
| Compaction recovery | Automatic | None | None |
| Local dependencies | Node.js only | Python, Docker, vector DB | None |
| Cross-device sync | Yes | No — stuck on one machine | No |
| Team knowledge sharing | Yes | Impossible | No |
| Self-improving recall | Yes — learns what matters | Static | No |
| Auto-linking knowledge graph | Automatic | No | No |
| Contradiction detection | Yes | No | No |
| Auto-skills (closed-loop learning) | Yes | No | No |
| Quantum-safe encryption | Yes | Varies | Plaintext |
| Mandatory rules | Yes | No | No |
| Cross-editor portability | 5+ editors | Usually 1 | Claude only |
| Survives laptop replacement | Yes | Data lost | Data lost |
| Price | Free — Pro $9.99/mo | Free (until it breaks) | Free (200 lines) |
Setup in 60 Seconds
Zero Install (Remote)
Connect directly — no npm, no Node.js, no config files. Just add the remote endpoint to your MCP client:
Endpoint: https://hifriendbot.com/mcp/
Auth: Bearer cm_YOUR_API_KEY
Works with any MCP client that supports Streamable HTTP transport — Claude Desktop, Cursor, and more. Get your free API key to connect.
Or install locally for compaction recovery hooks and offline support:
Run the setup wizard
npx cogmemai-mcp setup
The wizard asks you to choose a mode (Cloud, Local, or Hybrid), then walks you through the rest — API key verification, Claude Code configuration, and compaction recovery hooks.
Get your API key (all modes)
Sign up free and generate a key on this page. All modes require a free API key — for local mode it works like a software license key, and your data stays on your machine.
Done
Restart Claude Code. It now has persistent memory with automatic compaction recovery — it remembers your architecture, preferences, and decisions across every session. No prompting needed.
Manual setup (if you prefer not to use the wizard)
Per project — create .mcp.json in your project root:
{
  "mcpServers": {
    "cogmemai": {
      "command": "cogmemai-mcp",
      "env": {
        "COGMEMAI_API_KEY": "cm_your_key_here"
      }
    }
  }
}
Or global — works in every project:
claude mcp add cogmemai cogmemai-mcp -e COGMEMAI_API_KEY=cm_your_key_here --scope user
Local mode (free API key required, data stays local):
claude mcp add cogmemai cogmemai-mcp -e COGMEMAI_MODE=local --scope user
Manual setup does not install compaction recovery hooks. Run npx cogmemai-mcp setup afterward to add them.
Pricing
Start free. Upgrade when you need more.
Personal
- 1,000 memories
- 500 saves/mo
- 10 projects
- Semantic search
- Priority support
Team
- 10,000 memories
- 5,000 saves/mo
- 50 projects
- Shared team memory
- Priority support
Enterprise
- 50,000 memories
- 20,000 saves/mo
- 200 projects
- Shared team memory
- Priority support
Or pay per operation with USDC on-chain via x402 — no credit card or subscription required.
Frequently Asked Questions
Does CogmemAi see my source code?
No. CogmemAi stores extracted facts — short sentences like "This project uses React with TypeScript" or "Auth uses JWT tokens." Your actual source code never leaves your machine.
What happens when Claude Code compacts my context?
CogmemAi detects compaction automatically and saves a summary of your session before it happens. After compaction, your memories are reloaded so you can keep working without re-explaining anything.
Does it work with Cursor, Windsurf, and other editors?
Yes. CogmemAi works with any MCP-compatible Ai system: Claude Code, Cursor, Windsurf, Cline, Continue, autonomous systems, and more. Compaction recovery hooks are specific to Claude Code, but memory storage and retrieval work everywhere.
If I switch editors, do my memories follow?
Yes. Memories created in Claude Code are instantly available in Cursor, Windsurf, Cline, or any other MCP client. Your knowledge lives in the cloud, not in any specific tool.
Does it work across different Ai models?
Yes. CogmemAi is model-agnostic. Switch between Opus, Sonnet, Haiku, or any model your editor supports. Your memories persist regardless of which model is running.
Why not just use a local SQLite file?
CogmemAi offers local mode with full-text search (FTS5) and your data stays on your machine. A free API key is required for registration — like a software license key — but nothing leaves your device. However, local mode doesn’t include the full Intelligence Engine. Cloud mode — which is quantum-safe and superior in every way — adds semantic search (find memories by meaning), auto-linking knowledge graph, contradiction detection, self-improving recall, auto-skills, query synthesis, cross-device sync, and team collaboration. A local-only file also doesn’t follow you to a new machine, a new editor, or a teammate. Upgrading from local to cloud takes one command.
Can I export my memories?
Yes. You can export all your memories as JSON from the dashboard or via the MCP export_memories tool. You can also import memories from a JSON file. Your data is always yours.
Is there a free tier?
Yes. The free tier includes 500 memories, 500 extractions per month, and 5 projects. No credit card required. Upgrade when you need more.
How is my data secured?
All memories are protected with quantum-safe encryption — built to withstand both today's threats and tomorrow's quantum computers. Your data is encrypted before it’s written and decrypted only when recalled. API keys are cryptographically hashed (irreversible). All traffic is encrypted over HTTPS. Your data is never used for model training and is never shared between users. You can delete everything instantly. We store only extracted facts (short sentences), never raw source code.
Can I pay with crypto instead of a credit card?
Yes. Operations beyond the free tier can be paid per-use with USDC on-chain via the x402 protocol — no credit card or subscription required. When your free limits run out, the API returns a 402 response with USDC payment instructions. Pay on-chain, retry your request, and the operation completes. You can also subscribe via Stripe if you prefer automatic billing.
Quantum-Safe, Enterprise-Grade Security
Quantum-Safe Encryption
Every memory is protected with quantum-safe encryption — designed to resist both classical and quantum-computer attacks. Even in the event of a data breach, your memories are unreadable today and tomorrow. Cloud and local modes are both encrypted.
No Source Code Leaves Your Machine
We store extracted facts (short sentences), never raw code or file contents. API keys are cryptographically hashed (irreversible). All traffic encrypted over HTTPS.
Delete Anytime
Delete all your memories instantly via dashboard or MCP tool. Full data control. Your data is never used for model training.
No Cross-User Sharing
Your memories are yours. We never share data between users or use it for any other purpose. Sensitive data is automatically detected and excluded.
Works Everywhere You Code
One memory system across all your Ai systems. Any tool, any model, any machine, any device.
Stop Re-Explaining Your Codebase
Join developers who never lose context again.
HTTPS REST API
The same memory engine that powers the MCP server is available over plain HTTPS, in any language, from any runtime. Use it when MCP isn’t an option: serverless functions, web frontends, mobile apps, edge runtimes, agents written in Go, Rust, Python, Ruby, or any non-Node stack.
Base URL and authentication
https://hifriendbot.com/wp-json/hifriendbot/v1
Every request needs an API key in the Authorization header. Get one free at /developer/#get-key.
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
Response format
All responses are JSON. Success responses return the requested data. Errors return:
{
"error": "human-readable message",
"code": "machine-readable code"
}
HTTP status codes follow standard conventions: 200 for success, 400 for bad input, 401 for missing or invalid auth, 402 when usage exceeds the plan limit, 404 when the resource doesn’t exist, 429 for rate limiting, 5xx for backend issues.
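A client will usually want to turn that error shape into an exception. The sketch below is one way to do it, under stated assumptions: `CogmemError` and `parse_response` are hypothetical names, and treating 429 and 5xx as retryable is a common convention rather than something this page specifies.

```python
import json


class CogmemError(Exception):
    """Carries the HTTP status plus the documented "error" and "code" fields."""

    def __init__(self, status, code, message):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message

    @property
    def retryable(self):
        # Assumption: rate limiting and backend issues are worth retrying,
        # while 400, 401, 402, and 404 need a change on the caller's side.
        return self.status == 429 or 500 <= self.status < 600


def parse_response(status, body):
    """Return decoded JSON for a 200 response; raise CogmemError otherwise."""
    if status == 200:
        return json.loads(body)
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    # Fall back to the raw body when the error is not in the documented shape.
    raise CogmemError(status, payload.get("code", "unknown"), payload.get("error", body))
```

Callers can then branch on `exc.retryable` to decide between backing off and surfacing the error to the user.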
End-to-end example: save and recall
The shortest possible save → recall flow, in three languages.
curl:
# save a memory
curl -X POST https://hifriendbot.com/wp-json/hifriendbot/v1/cogmemai/store \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Orders service uses GraphQL only. Mobile latency budget 200ms.",
    "memory_type": "decision",
    "importance": 9,
    "scope": "project",
    "project_id": "orders-service"
  }'

# recall later (different session, same project)
curl -X POST https://hifriendbot.com/wp-json/hifriendbot/v1/cogmemai/recall \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "what did we decide about the orders service API?",
    "project_id": "orders-service",
    "limit": 5
  }'
Python:
import os

import requests

BASE = "https://hifriendbot.com/wp-json/hifriendbot/v1"
# Read the key from the environment instead of hard-coding it.
api_key = os.environ["COGMEMAI_API_KEY"]
HEADERS = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}

# save
requests.post(f"{BASE}/cogmemai/store", headers=HEADERS, json={
    "content": "Orders service uses GraphQL only. Mobile latency budget 200ms.",
    "memory_type": "decision",
    "importance": 9,
    "scope": "project",
    "project_id": "orders-service",
})

# recall
result = requests.post(f"{BASE}/cogmemai/recall", headers=HEADERS, json={
    "query": "what did we decide about the orders service API?",
    "project_id": "orders-service",
    "limit": 5,
}).json()
JavaScript / TypeScript (Node 20+):
const BASE = "https://hifriendbot.com/wp-json/hifriendbot/v1";
// Read the key from the environment instead of hard-coding it.
const apiKey = process.env.COGMEMAI_API_KEY;
const headers = {
  "Authorization": `Bearer ${apiKey}`,
  "Content-Type": "application/json",
};

// save
await fetch(`${BASE}/cogmemai/store`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    content: "Orders service uses GraphQL only. Mobile latency budget 200ms.",
    memory_type: "decision",
    importance: 9,
    scope: "project",
    project_id: "orders-service",
  }),
});

// recall
const result = await fetch(`${BASE}/cogmemai/recall`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    query: "what did we decide about the orders service API?",
    project_id: "orders-service",
    limit: 5,
  }),
}).then(r => r.json());
Endpoint reference
All endpoints sit under /cogmemai/ (full path: /wp-json/hifriendbot/v1/cogmemai/...). All POST and PATCH endpoints accept JSON bodies. All GET endpoints take query-string parameters.
Memory lifecycle
| Method | Path | Purpose |
|---|---|---|
| POST | /cogmemai/store | Save a memory. Also used for corrections and reminders (pass memory_type: "correction" or "reminder"). |
| POST | /cogmemai/recall | Semantic recall, ranked by relevance + recency + importance. |
| POST | /cogmemai/smart-recall | Higher-quality recall with reasoning, for complex queries. |
| GET | /cogmemai/context | Load top memories for a project (call this on session start). |
| GET | /cogmemai/memories | List all memories matching filters (type, category, scope, tag). |
| PATCH | /cogmemai/memory/{id} | Update a memory’s content, importance, type, or tags. |
| DELETE | /cogmemai/memory/{id} | Delete a memory. |
| POST | /cogmemai/memory/{id}/promote | Lift a project-scope memory to global so it applies everywhere. |
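As a rough illustration of the lifecycle endpoints, the sketch below builds three requests with the Python standard library: load context at session start, raise a memory's importance, and promote it to global scope. The helper names `build_request` and `send`, the memory ID 42, and passing `project_id` as a query-string parameter to /cogmemai/context are assumptions for illustration; only the paths and methods come from the table above.

```python
import json
import urllib.parse
import urllib.request

BASE = "https://hifriendbot.com/wp-json/hifriendbot/v1"
API_KEY = "cm_your_key_here"  # placeholder; substitute a real key


def build_request(method, path, api_key, *, params=None, body=None):
    """Build (but do not send) an authenticated request.

    GET parameters go in the query string; POST and PATCH bodies are JSON.
    """
    url = f"{BASE}{path}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send(request, timeout=30):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Session start: load the top memories for one project (parameter name assumed).
load_context = build_request(
    "GET", "/cogmemai/context", API_KEY, params={"project_id": "orders-service"}
)

# Raise the importance of memory 42 (hypothetical ID).
update = build_request("PATCH", "/cogmemai/memory/42", API_KEY, body={"importance": 10})

# Lift the same memory from project scope to global scope.
promote = build_request("POST", "/cogmemai/memory/42/promote", API_KEY)
```

Each prepared request can be passed to `send(...)` when a real key is in place; building and sending are kept separate so the request shape can be inspected or logged first.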
Bulk operations
| Method | Path | Purpose |
|---|---|---|
| POST | /cogmemai/bulk-delete | Delete many memories by ID list. Body: { "ids": [1,2,3] }. |
| POST | /cogmemai/bulk-update | Update many memories in one call. Body: { "updates": [{ "id": 1, ... }] }. |
| GET | /cogmemai/export | Export all memories matching a filter (JSON download). |
| POST | /cogmemai/import | Bulk-load memories from a previous export or external source. |
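The two bulk bodies shown in the table are easy to get subtly wrong, so it can help to build them through small helpers. The functions below are hypothetical client-side conveniences, not part of any CogmemAi SDK. They assume integer IDs (as in the examples above), and the checks they apply are local sanity checks, not documented server rules.

```python
def bulk_delete_body(ids):
    """Body for POST /cogmemai/bulk-delete: integer IDs, duplicates dropped, order kept."""
    unique = list(dict.fromkeys(int(i) for i in ids))
    if not unique:
        raise ValueError("ids must not be empty")
    return {"ids": unique}


def bulk_update_body(updates):
    """Body for POST /cogmemai/bulk-update: each entry needs an id plus a change."""
    cleaned = []
    for update in updates:
        if "id" not in update:
            raise ValueError("every update needs an 'id'")
        if len(update) < 2:
            raise ValueError(f"update for id {update['id']} changes nothing")
        cleaned.append(dict(update))  # copy so the caller's dicts stay untouched
    if not cleaned:
        raise ValueError("updates must not be empty")
    return {"updates": cleaned}
```

For example, `bulk_delete_body([3, 1, 3, 2])` yields `{"ids": [3, 1, 2]}`, which can be sent as the JSON body of the bulk-delete call.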
Intelligence
| Method | Path | Purpose |
|---|---|---|
| POST | /cogmemai/extract | Extract a batch of memories from a conversation transcript. Identifies type, importance, and tags automatically. |
| POST | /cogmemai/ingest | Ingest a long document (PDF, markdown, transcript) as memories. Body: { "content": "...", "project_id": "..." }. |
| POST | /cogmemai/consolidate | Merge related memories into a single comprehensive summary. Takes up to 30 seconds. |
| POST | /cogmemai/generate-skills | Generate Anthropic-style SKILL.md content from accumulated memories. |
| POST | /cogmemai/extract-principles | Distill long-standing patterns into reusable principles. Up to 30 seconds. |
Knowledge graph
| Method | Path | Purpose |
|---|---|---|
| POST | /cogmemai/memory/{id}/link | Connect this memory to another (related fact, supersedes, contradicts). |
| GET | /cogmemai/memory/{id}/links | Get all memories linked to this one, with relationship types. |
| GET | /cogmemai/memory/{id}/versions | Version history for a memory (auto-tracked on every PATCH). |
Analytics and feedback
| Method | Path | Purpose |
|---|---|---|
| GET | /cogmemai/analytics | Usage patterns, recall frequency, health metrics. |
| GET | /cogmemai/usage | Current monthly usage vs plan limits. |
| GET | /cogmemai/stale | Memories that haven’t been recalled in N days, candidates for cleanup. |
| GET | /cogmemai/tags | List all tags currently in use, with memory counts. |
| POST | /cogmemai/feedback | Signal that a recalled memory was useful or not. Improves ranking over time. |
Sessions
| Method | Path | Purpose |
|---|---|---|
| POST | /cogmemai/session-summary | Save a wrap-up summary of what was accomplished in a session. Surfaces at next session start. |
Request body essentials
Common fields across /cogmemai/store and recall endpoints:
| Field | Type | Notes |
|---|---|---|
| content | string | The memory itself. One or two sentences, self-contained. |
| memory_type | string | identity, preference, architecture, decision, bug, dependency, pattern, context, session_summary, task, correction, reminder. Custom types accepted. |
| importance | int 1-10 | 10 = core architecture, 1 = trivial. |
| scope | string | project (default) or global. team requires Pro plan. |
| project_id | string | Scopes the memory to one codebase. Auto-detected from cwd or git remote when called via MCP; pass explicitly via REST. |
| tags | string[] | Up to 5 tags, each up to 30 chars. For grouping related memories. |
| category | string | frontend, backend, devops, etc. Custom values accepted. |
| subject | string | Short label (e.g. auth_system) for organization. |
| ttl | string | Auto-expire after this duration (24h, 7d, 30d). |
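The constraints in the table can be checked client-side before a request is sent. The sketch below is a hypothetical helper, not an official validator. It assumes ttl values always look like the three documented examples (a positive integer followed by h or d); other units may exist but are not assumed here.

```python
import re

# Matches the documented examples (24h, 7d, 30d); other units are not assumed.
TTL_PATTERN = re.compile(r"^[1-9]\d*[hd]$")
SCOPES = {"project", "global", "team"}


def build_store_payload(content, *, memory_type, importance, scope="project",
                        project_id=None, tags=(), ttl=None):
    """Validate fields against the table above and return a JSON-ready dict."""
    if not isinstance(content, str) or not content.strip():
        raise ValueError("content must be a non-empty string")
    if isinstance(importance, bool) or not isinstance(importance, int):
        raise ValueError("importance must be an integer")
    if not 1 <= importance <= 10:
        raise ValueError("importance must be between 1 and 10")
    if scope not in SCOPES:
        raise ValueError("scope must be project, global, or team")
    tags = list(tags)
    if len(tags) > 5 or any(len(tag) > 30 for tag in tags):
        raise ValueError("at most 5 tags, each at most 30 characters")
    if ttl is not None and not TTL_PATTERN.match(ttl):
        raise ValueError("ttl must look like 24h, 7d, or 30d")

    payload = {
        "content": content,
        "memory_type": memory_type,  # custom types are accepted, so not checked
        "importance": importance,
        "scope": scope,
    }
    # Optional fields are only included when the caller supplied them.
    if project_id is not None:
        payload["project_id"] = project_id
    if tags:
        payload["tags"] = tags
    if ttl is not None:
        payload["ttl"] = ttl
    return payload
```

The returned dict can be passed directly as the JSON body of a /cogmemai/store call. Failing early on a bad importance or tag list avoids spending a request on input that would be rejected with a 400.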
Rate limits
| Plan | Memories / month | Extractions / month | Projects | Requests / minute |
|---|---|---|---|---|
| Free | 500 | 500 | 5 | 60 |
| Pro | Higher limits | Higher limits | Unlimited | 600 |
| Team / On-prem | Custom | Custom | Unlimited | Custom |
When you exceed a limit, requests return HTTP 402 Payment Required with an upgrade link. See pricing.
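To stay under the requests-per-minute column, a client can throttle itself instead of waiting for a 429. The sketch below is a generic sliding-window limiter, not something CogmemAi ships. `SlidingWindowLimiter` is a hypothetical name, and the defaults mirror the Free plan's 60 requests per minute from the table above.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `period`-second window."""

    def __init__(self, max_calls=60, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock    # injectable so the limiter can be tested without waiting
        self.sleep = sleep
        self.calls = deque()  # timestamps of calls still inside the window

    def acquire(self):
        """Block until a call is allowed, record it, and return seconds waited."""
        waited = 0.0
        while True:
            now = self.clock()
            # Forget calls that have aged out of the window.
            while self.calls and now - self.calls[0] >= self.period:
                self.calls.popleft()
            if len(self.calls) < self.max_calls:
                self.calls.append(now)
                return waited
            # Window is full: wait until the oldest call expires, then re-check.
            delay = self.period - (now - self.calls[0])
            self.sleep(delay)
            waited += delay
```

Calling `limiter.acquire()` before each request keeps a single process within the per-minute limit. Several processes sharing one API key would need a shared limiter, which this sketch does not cover.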
SDKs and clients
- MCP server (TypeScript, the recommended path for Claude Code, Claude Agent SDK, Cursor, Windsurf, Cline, Continue): npm install -g cogmemai-mcp
- Direct API / SDK for low-latency deep integration: contact us
- Working examples in JS and Python: quickstart repo
Help
Stuck? Tell us what you’re building. We answer real engineering questions, not just sales.
