npm i -g hippo-memory · v1.15.0 · MIT

Know what to forget.

A memory layer for AI agents, modeled on the hippocampus. Decay by default, strength through use, provenance on every memory.

$ npm install -g hippo-memory


678 stars · 9.9k downloads/mo

Works with

Claude Code
Codex
Cursor
OpenClaw
OpenCode
Pi
any MCP client

An example session (hippo · ~/projects):

$ npm install -g hippo-memory
$ hippo init --scan ~
✓ memory across every git repo under ~
$ hippo remember "deploy failed: forgot migrations" --error
▸ stored · half-life 14d [verified]
$ hippo recall "why did the deploy break"
▸ deploy failed: forgot migrations 0.91 [verified]
▸ run migrations before release 0.74 [observed]

The problem

Most AI memory saves everything and searches later.

That's storage with semantic search bolted on. It's why your agent kept hitting the same deploy bug last week. And the week before.

The system saw the failure four times. It had no way to know it should remember.

How it works

Memories decay. Retrieval makes them stronger.

The thing brains have been getting right for 500 million years. Hard lessons stick because you used them. Trivia fades because you didn't.

Buffer

New information lands here. Session-only, no decay.

Episodic

Timestamped, decays by default. Retrieval strengthens it; errors stick.

Semantic

Repeated episodes compress into stable patterns. The originals decay.

weak traces are forgotten; repeated ones consolidate during sleep

strength = decay over time, re-strengthened on every recall

7d half-life

Decay by default

Stored memories fade on a 7-day half-life by default. Persistence is earned, not assumed.

+2d / recall

Retrieval strengthens

Use it or lose it. Each recall extends the half-life by 2 days. Memories you reach for learn to survive.

2x half-life

Errors stick

Tag a failure once. It decays more slowly and resurfaces every time you walk back into that code.

3+ → 1

Sleep consolidates

On `hippo sleep`, three or more related episodes merge into one semantic pattern. The originals decay; the pattern survives.
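The decay rules above can be sketched in a few lines. This is a minimal illustration, not hippo's actual implementation: it assumes plain exponential decay and assumes that a recall resets the decay clock. The names `MemoryTrace`, `remember`, `strengthAt`, and `recall` are hypothetical. Only the numbers (7-day default half-life, +2 days per recall, 2x half-life for errors) come from the text.

```typescript
// Hypothetical model of a decaying memory; not hippo's real data structure.
interface MemoryTrace {
  text: string;
  halfLifeDays: number;   // current half-life, in days
  lastTouchedDay: number; // day of creation or of the most recent recall
}

const DEFAULT_HALF_LIFE_DAYS = 7; // "7d half-life"
const RECALL_BONUS_DAYS = 2;      // "+2d / recall"
const ERROR_MULTIPLIER = 2;       // "2x half-life"

// Errors start with a doubled half-life (14 days instead of 7).
function remember(text: string, today: number, isError = false): MemoryTrace {
  const halfLifeDays = isError
    ? DEFAULT_HALF_LIFE_DAYS * ERROR_MULTIPLIER
    : DEFAULT_HALF_LIFE_DAYS;
  return { text, halfLifeDays, lastTouchedDay: today };
}

// Assumed exponential decay: strength halves every halfLifeDays.
function strengthAt(m: MemoryTrace, today: number): number {
  const elapsed = Math.max(0, today - m.lastTouchedDay);
  return Math.pow(0.5, elapsed / m.halfLifeDays);
}

// A recall extends the half-life and (by assumption) resets the clock.
function recall(m: MemoryTrace, today: number): MemoryTrace {
  return {
    ...m,
    halfLifeDays: m.halfLifeDays + RECALL_BONUS_DAYS,
    lastTouchedDay: today,
  };
}

const trivia = remember("tabs vs spaces debate", 0);
const lesson = remember("deploy failed: forgot migrations", 0, true);
console.log(strengthAt(trivia, 7));  // 0.5
console.log(strengthAt(lesson, 14)); // 0.5
const used = recall(trivia, 5);
console.log(used.halfLifeDays);      // 9
```

Under these assumptions, an untouched memory is at half strength after a week, a tagged error takes two weeks to get there, and every recall pushes the horizon out further.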

Get started

Zero config. It wires itself in.

Install it, point it at your repos, and hippo auto-detects your agent framework and patches the right config file. Next session, your agent just uses it.

Detected and patched automatically

Claude Code
Codex
Cursor
OpenClaw
OpenCode

The only memory layer that installs its own hooks. No manual wiring.

1. $ npm install -g hippo-memory
2. $ hippo init --scan ~

Then `hippo sleep` runs at session end via the auto-installed hook and consolidates what you learned.
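To make the consolidation step concrete, here is a rough sketch of the "3+ → 1" rule. It is not hippo's algorithm: how hippo decides that episodes are "related" is not described here, so the sketch assumes each episode already carries a hypothetical `topic` label, and the summary is a placeholder rather than a real compression.

```typescript
// Hypothetical shapes; hippo's real schema may differ.
interface Episode { id: number; topic: string; text: string; }
interface SemanticPattern { topic: string; summary: string; sourceIds: number[]; }

const MIN_EPISODES = 3; // "three or more related episodes merge into one"

function consolidate(episodes: Episode[]): SemanticPattern[] {
  // Group episodes by their (assumed) topic label.
  const groups: Record<string, Episode[]> = {};
  for (const ep of episodes) {
    const group = groups[ep.topic] ?? [];
    group.push(ep);
    groups[ep.topic] = group;
  }

  // Emit one pattern for every group that reaches the threshold.
  const patterns: SemanticPattern[] = [];
  for (const topic of Object.keys(groups)) {
    const group = groups[topic] ?? [];
    if (group.length >= MIN_EPISODES) {
      patterns.push({
        topic,
        // Placeholder summary; a real system would compress the texts.
        summary: `${group.length} episodes about ${topic}`,
        sourceIds: group.map((ep) => ep.id),
      });
    }
  }
  return patterns;
}

const patterns = consolidate([
  { id: 1, topic: "deploy", text: "deploy failed: forgot migrations" },
  { id: 2, topic: "deploy", text: "deploy failed again: migrations skipped" },
  { id: 3, topic: "deploy", text: "run migrations before release" },
  { id: 4, topic: "lint", text: "lint rule was too strict" },
]);
console.log(patterns.length); // 1
```

The single "lint" episode stays an episode and keeps decaying; only the repeated "deploy" lesson becomes a pattern.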

Local-first

Your memory never leaves your machine.

0 outbound HTTP

Proven by a globalThis.fetch spy that throws on call, across the 1000-event ingestion smoke test. Not a hardcoded zero.

SQLite on disk

Memories live in a local .hippo/ store you can read, grep, and git-track. No cloud, no account, no telemetry.

1 call to forget

Right-to-be-forgotten is a single API call. Every row carries kind, scope, owner, and provenance.

Tenant-safe by default

Multi-tenant keys are scrypt-hashed with an audit log on every mutation. Tenant A cannot see tenant B, proven by a negative test.
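As an illustration of the tenant-safety claims, the sketch below hashes keys with scrypt from Node's built-in `node:crypto` module, logs every mutation, and runs the negative check that tenant B sees nothing of tenant A. Everything except the `node:crypto` calls is hypothetical: the `TenantKey` and `MemoryRow` shapes, the function names, and the in-memory arrays stand in for whatever hippo actually stores.

```typescript
import { randomBytes, scryptSync, timingSafeEqual } from "node:crypto";

// Hypothetical shapes for illustration; hippo's real schema may differ.
interface TenantKey { tenantId: string; salt: Buffer; hash: Buffer; }
interface MemoryRow { tenantId: string; text: string; }

// Store only a salted scrypt hash, never the raw key.
function hashKey(tenantId: string, apiKey: string): TenantKey {
  const salt = randomBytes(16);
  return { tenantId, salt, hash: scryptSync(apiKey, salt, 32) };
}

// Recompute with the stored salt and compare in constant time.
function verifyKey(stored: TenantKey, apiKey: string): boolean {
  const candidate = scryptSync(apiKey, stored.salt, 32);
  return timingSafeEqual(candidate, stored.hash);
}

const auditLog: string[] = [];
const rows: MemoryRow[] = [];

// Every mutation appends an audit entry.
function rememberFor(tenantId: string, text: string): void {
  rows.push({ tenantId, text });
  auditLog.push(`remember tenant=${tenantId}`);
}

// Reads are always scoped to the caller's tenant.
function recallFor(tenantId: string): string[] {
  return rows.filter((r) => r.tenantId === tenantId).map((r) => r.text);
}

const keyA = hashKey("tenant-a", "secret-a");
console.log(verifyKey(keyA, "secret-a")); // true
console.log(verifyKey(keyA, "secret-b")); // false
rememberFor("tenant-a", "deploy failed: forgot migrations");
console.log(recallFor("tenant-b").length); // 0 (the negative check)
```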

And it's not locked to one tool.

Your ChatGPT memories don't travel to Claude; your .cursorrules don't travel to Codex. Hippo is one store behind all of them.

Imports from: ChatGPT, CLAUDE.md, .cursorrules, Slack, markdown

Receipts

Numbers, not adjectives.

every claim links to its source

74% R@5 on LongMemEval

BM25 only, no embeddings. 73.8% with hybrid scoring (v0.28). source ↗

926 tests, real DB

Zero mocks. Project rule: no mocked dependencies in tests. source ↗

0 runtime deps

Node 22.5+. SQLite under the hood. Optional embeddings. source ↗

MIT licensed

SQLite backbone with markdown mirrors. Git-trackable, human-readable. source ↗


Compare

Forget by default. Earn persistence through use.

The AI-memory category matured fast in 2026. Hippo's take (bio-decay, strengthen-on-use, outcome-weighted half-lives) is one stance among several. The matrix below is a feature snapshot, not a verdict.

Feature	Hippo	MemPalace	Mem0	Basic Memory	gbrain	Zep	Letta	Cognee	Memoria	EverMind
Decay by default	Yes	No	No	No	No	No	No	No	No	No
Retrieval strengthening	Yes	No	No	No	No	No	No	Partial (recall tuning)	No	Partial (Skill Memory distills patterns)
Reward-proportional decay	Yes	No	No	No	No	No	No	No	No	No
Hybrid search (BM25 + embeddings)	Yes	Embeddings + spatial	Embeddings only	No	Yes (vec + rerank + graph)	Yes (graph + vec)	?	Yes (GraphRAG)	Yes (vector + full-text)	Yes (mRAG, multi-modal)
Schema acceleration / knowledge graph	Yes (schema)	No	No	No	Yes (typed KG, self-wiring)	Yes (temporal KG)	No	Yes (auto-ontologies)	No (typed claims)	Yes (hierarchical: user/group/agent)
Conflict detection + resolution	Yes	No	No	No	Yes (eval-surfaced)	Yes (auto-invalidate stale facts)	No	No	Yes (auto-detect + quarantine)	Partial (temporal tracking)
Multi-agent shared memory	Yes	No	No	No	Yes (brain repo, team mounts)	Yes	No (single-agent state)	Yes	Yes (branch/merge across sessions)	Yes (multi-agent coordination)
Transfer scoring	Yes	No	No	No	No	No	No	No	No	No
Outcome tracking	Yes	No	No	No	No	No	No	No	No	Partial (Cases: agent trajectories)
Confidence tiers	Yes	No	No	No	No (typed facts)	No	No	No	No	No
Spatial organization	No	Yes (wings/halls/rooms)	No	No	No	No	No	No	No	No
Lossless compression	No	Yes (AAAK, 30x)	No	No	No	No	No	No	No	No
Cross-tool import (ChatGPT/Claude/Cursor)	Yes	No	No	No	Partial (data sources)	?	No	Partial (28 data sources)	No (Git ops)	Partial (mRAG: PDFs/images/URLs)
Auto-hook install	Yes	No	No	No	No	No	No	No	No	No
MCP server	Yes	Yes	No	No	Yes (stdio + HTTP/OAuth)	Partial (managed)	Yes (via Letta Code)	Yes (first-party Claude/LangGraph)	Yes	?
Zero runtime deps	Yes	No (ChromaDB)	No	No	No (PGLite or PG+pgvector)	No (managed service)	No (Python deps)	No (Python deps)	Yes (single Rust binary)	No (managed + OSS)
LongMemEval (best published)	86.8% R@5 (F13+F9, oracle*)	96.6% raw / 100% reranked R@5	~49-85% R@5	N/A	97.6-97.9% R@5 (s_cleaned*)	N/A (LoCoMo 80.3%)	N/A	N/A	88.78% overall accuracy w/ reader**	83.00% overall** (LoCoMo 93.05%, HaluMem 93.04%)
Git-friendly	Yes	No	No	Yes	Yes	No	No	No	Yes (Git is the model)	?
Framework agnostic	Yes	Yes	Partial	Yes	Yes	Yes	Yes	Yes	Yes	Yes
License	MIT	(open)	Apache-2.0	(open)	MIT	Apache-2.0 (community)	Apache-2.0	MIT (core)	Apache-2.0	Apache-2.0 (OSS) + cloud

* Split-mismatched: Hippo's 86.8% is on longmemeval_oracle (3 sessions per haystack); gbrain's 97.6% is on longmemeval_s_cleaned (~40 sessions per haystack). Different splits, different difficulty. Not directly comparable.

** Different metric: Memoria's 88.78% and EverMind's 83% are reported as overall accuracy with a reader LLM, not retrieval R@5. A higher denominator and a reader LLM both help, so these are not directly comparable to the retrieval-only R@5 numbers above.

Different tools answer different questions. Mem0 and Basic Memory implement "save everything, search later." MemPalace organizes spatially. gbrain, Zep, and Cognee extract typed entities into a knowledge graph. Letta lets the agent edit its own memory blocks. Memoria is Git-style version control over memory. EverMind is self-evolving Skill Memory. Hippo implements "forget by default, earn persistence through use." Complementary takes, not a single-axis ranking.

Source: the full comparison in the README

FAQ

Questions, answered.

Is this just RAG?

No. RAG retrieves from a static corpus; hippo is a memory lifecycle. Memories decay on a half-life, retrieval strengthens them, errors stick, and sleep consolidates repeats into patterns. It forgets by default and earns persistence through use.

Does it need embeddings?

No. Recall runs on BM25 out of the box (74% R@5 on LongMemEval, BM25 only). Embeddings are an optional dependency for hybrid scoring; nothing is required at runtime.
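For readers unfamiliar with BM25, here is a small self-contained sketch of the standard Okapi BM25 ranking function. It is not hippo's recall code: the tokenizer, the parameter values (`K1 = 1.2`, `B = 0.75`, common textbook defaults), the IDF variant, and the function names are all assumptions chosen for illustration.

```typescript
const K1 = 1.2;  // term-frequency saturation (assumed default)
const B = 0.75;  // document-length normalization (assumed default)

// Naive tokenizer: lowercase, split on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

function bm25Rank(query: string, docs: string[]): { doc: string; score: number }[] {
  const tokenized = docs.map(tokenize);
  const totalLen = tokenized.reduce((sum, d) => sum + d.length, 0);
  const avgLen = totalLen / Math.max(1, docs.length);
  // Deduplicate query terms so each term is scored once.
  const queryTerms = tokenize(query).filter((t, i, all) => all.indexOf(t) === i);

  const scored = docs.map((doc, i) => {
    const tokens = tokenized[i] ?? [];
    let score = 0;
    for (const term of queryTerms) {
      const tf = tokens.filter((t) => t === term).length;
      if (tf === 0) continue;
      // Document frequency: how many docs contain the term.
      const df = tokenized.filter((d) => d.indexOf(term) !== -1).length;
      const idf = Math.log(1 + (docs.length - df + 0.5) / (df + 0.5));
      const norm = tf + K1 * (1 - B + B * (tokens.length / avgLen));
      score += idf * ((tf * (K1 + 1)) / norm);
    }
    return { doc, score };
  });
  return scored.sort((a, b) => b.score - a.score);
}

const ranked = bm25Rank("why did the deploy break", [
  "run migrations before release",
  "deploy failed: forgot migrations",
  "updated README badges",
]);
console.log(ranked[0]?.doc); // deploy failed: forgot migrations
```

Pure BM25 only rewards shared words, which is why the other two notes score zero in this toy run. The scores here are unrelated to the relevance numbers shown in the example session.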

Where does my data go?

Nowhere. Everything lives in a local SQLite store with markdown mirrors. The ingestion smoke test makes 0 outbound HTTP calls, proven by a fetch spy. No cloud, no account, no telemetry.
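The fetch-spy idea is easy to reproduce. The sketch below is one way to do it, not hippo's actual test: it temporarily replaces `globalThis.fetch` with a function that counts calls and throws, runs some work, then restores the original. The helper name `countOutboundFetches` and the stand-in "ingestion" loop are hypothetical.

```typescript
// Replace globalThis.fetch with a counting spy that throws, run the work,
// then restore the original. Returns how many times fetch was called.
async function countOutboundFetches(work: () => Promise<void>): Promise<number> {
  const g = globalThis as unknown as Record<string, unknown>;
  const original = g["fetch"];
  let calls = 0;
  g["fetch"] = () => {
    calls += 1;
    throw new Error("outbound HTTP attempted");
  };
  try {
    await work();
  } finally {
    g["fetch"] = original; // always restore, even if the work throws
  }
  return calls;
}

// Hypothetical stand-in for a 1000-event ingestion run that stays local.
const sampleEvents = Array.from({ length: 1000 }, (_, i) => `event-${i}`);
countOutboundFetches(async () => {
  const localStore: string[] = [];
  for (const e of sampleEvents) localStore.push(e);
}).then((calls) => console.log(`outbound fetch calls: ${calls}`)); // outbound fetch calls: 0
```

Because the spy throws, any code path that reaches for the network fails loudly instead of silently succeeding, so the zero is measured rather than asserted.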

Which agents does it work with?

hippo init auto-installs hooks for Claude Code, Codex, Cursor, OpenClaw, and OpenCode, and exposes an MCP server for any MCP client (Cursor, Windsurf, Cline, Claude Desktop).

Is it production-ready?

It is MIT-licensed at v1.15.0, with 926 tests against a real database and zero mocks. Multi-tenant isolation is proven by a negative test.

One command. Every repo gets memory.

Zero config. SQLite under the hood, zero runtime deps, works with every CLI agent you have.