How can we help?
Documentation, answers, and ways to reach us.
Getting Started
Frequently Asked Questions
0Latency is a memory layer for AI agents. Install it, call .add() to store, call .recall() to retrieve. Sub-100ms recall, always. No configuration needed: temporal dynamics, contradiction detection, knowledge graphs, and context budgets all work automatically. It just works.
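To make the add-then-recall flow concrete, here is a minimal sketch using a toy in-memory stand-in. ToyMemory and its word-overlap ranking are assumptions made for illustration; this is not the 0Latency SDK, and the real client's import path, constructor, and return types may differ.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemory:
    """In-memory stand-in for a memory layer; NOT the real 0Latency client."""

    items: list[str] = field(default_factory=list)

    def add(self, text: str) -> None:
        # Store the raw text. A real memory system would also extract facts.
        self.items.append(text)

    def recall(self, query: str, limit: int = 3) -> list[str]:
        # Rank stored items by how many words they share with the query.
        words = set(query.lower().split())
        scored = [(len(words & set(t.lower().split())), t) for t in self.items]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
        return [text for _, text in ranked[:limit]]


memory = ToyMemory()
memory.add("The user prefers dark mode")
memory.add("The user lives in Lisbon")
hits = memory.recall("Which mode does the user prefer?")
```

The point is the shape of the loop: the agent writes what it learns with add() and asks for relevant context with recall() before it calls the LLM.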
Vector databases do similarity search — you put text in, you get similar text back. 0Latency is a complete memory system with temporal dynamics (decay and reinforcement), contradiction detection, context budget management, and structured recall. It's the difference between "find similar text" and "what should this agent know right now?" We handle the entire lifecycle: extraction, storage, scoring, recall, and pruning.
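As a rough illustration of what decay, reinforcement, and a context budget can mean in practice, here is a small sketch. The exponential half-life formula, the 30-day constant, and the word-count budget are all assumptions chosen for the example, not a description of how 0Latency scores memories internally.

```python
from dataclasses import dataclass

HALF_LIFE_DAYS = 30.0  # assumed value, for illustration only


@dataclass
class Memory:
    text: str
    age_days: float  # time since the memory was last reinforced
    strength: float = 1.0


def score(m: Memory) -> float:
    # Exponential decay: the score halves every HALF_LIFE_DAYS.
    return m.strength * 0.5 ** (m.age_days / HALF_LIFE_DAYS)


def reinforce(m: Memory) -> None:
    # Using a memory makes it stronger and resets its age.
    m.strength += 1.0
    m.age_days = 0.0


def within_budget(memories: list[Memory], max_words: int) -> list[str]:
    # Greedily take the highest-scoring memories that still fit the budget.
    chosen: list[str] = []
    used = 0
    for m in sorted(memories, key=score, reverse=True):
        cost = len(m.text.split())
        if used + cost <= max_words:
            chosen.append(m.text)
            used += cost
    return chosen
```

A plain similarity search has no equivalent of score() or within_budget(): it returns the nearest text regardless of how old it is or how much room the prompt has left.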
It just works. No latency tuning, no performance configuration, no knobs to turn. Temporal intelligence, proactive recall, context budget management, and graph memory all work automatically on every plan — including free. Mem0 paywalls graph memory at $249/month and still requires you to configure scoring. With 0Latency: install, add, recall. We handle the rest.
Any of them. We provide a REST API plus Python and TypeScript SDKs that integrate with LangChain, CrewAI, AutoGen, or any custom agent framework. If you're using OpenClaw, there's a zero-config skill that handles everything automatically. If your agent can make HTTP requests, it can use 0Latency.
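For agents that only speak HTTP, a call might be assembled roughly like this. The base URL, the /memories path, the payload field name, and the bearer-token header are placeholders invented for this sketch; consult the API reference for the real endpoint and schema. The request is built but deliberately not sent.

```python
import json
import urllib.request


def build_add_request(base_url: str, api_key: str, text: str) -> urllib.request.Request:
    # Hypothetical endpoint path and payload shape; not taken from the real API.
    body = json.dumps({"content": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/memories",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Placeholder host and key; nothing is sent over the network here.
req = build_add_request("https://api.example.com", "YOUR_API_KEY", "User prefers dark mode")
```

Passing the request to urllib.request.urlopen would send it. Any framework that can issue an equivalent POST can integrate the same way.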
You choose. Self-hosted plans run on your own Supabase/Postgres instance — your data never leaves your infrastructure. Pro and API plans use our hosted infrastructure. Either way, you own your data completely. Export anytime, no lock-in.
Sub-100ms. Always. No configuration needed. Recall is synchronous and optimized for the fast path — you get your memories back before your LLM even starts generating. Extraction happens asynchronously in the background, so storing memories never blocks your agent either.
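The split between a synchronous recall path and background extraction can be pictured with a queue and a worker thread. This is a toy model under stated assumptions: BackgroundStore, its substring matching, and the lowercase "extraction" step are illustrative only and say nothing about 0Latency's actual internals.

```python
import queue
import threading


class BackgroundStore:
    """Toy store: add() enqueues and returns; a worker does the slow part."""

    def __init__(self) -> None:
        self._pending: queue.Queue = queue.Queue()
        self._facts: list[str] = []
        self._lock = threading.Lock()
        threading.Thread(target=self._worker, daemon=True).start()

    def add(self, text: str) -> None:
        # Returns immediately; extraction happens on the worker thread.
        self._pending.put(text)

    def _worker(self) -> None:
        while True:
            text = self._pending.get()
            fact = text.strip().lower()  # placeholder for real extraction
            with self._lock:
                self._facts.append(fact)
            self._pending.task_done()

    def recall(self, query: str) -> list[str]:
        # Synchronous read over whatever has been extracted so far.
        with self._lock:
            return [f for f in self._facts if query.lower() in f]

    def flush(self) -> None:
        # Block until the queue is drained; only needed for this demo.
        self._pending.join()
```

Because add() only enqueues, the agent never waits on extraction, and recall() reads what is already stored without triggering any new work.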
Gemini 2.0 Flash. It is fast and cheap, which keeps extraction costs negligible. We chose it so you don't have to. Extraction runs asynchronously in the background, so it never adds latency to your agent's responses.
Yes. The free tier includes 10,000 memories, 3 agents, and the full feature set — temporal intelligence, graph memory, contradiction detection, everything. No feature gating. When you're ready to scale, upgrade.
Contact Us
System Status
All systems operational
API Status →