THE ARCHITECTURE

Built for the way memory actually works.

Not storage. Not retrieval. A typed knowledge graph that connects people, decisions, and the context around them.

A typed graph, not a pile of text.

RAG / vector search

Finds similar text. Loses the relationship between the meeting, the decision made there, and the person who changed their mind.

Context windows

Reset every session. Every agent starts from zero. No continuity across tools, time, or team.

LittleGuy

A typed knowledge graph where Sarah is a Person, connected to the Decision you made together, which superseded an earlier Commitment. Traversable. Permanent. Agentically queryable.
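To make "typed" concrete, here is a minimal in-memory sketch of the Sarah example. The edge names (PARTICIPATED_IN, SUPERSEDES), the TypedGraph class, and the sample labels are hypothetical illustrations, not LittleGuy's actual schema.

```typescript
// Hypothetical node and edge shapes, for illustration only.
type NodeType = "Person" | "Decision" | "Commitment";
type EdgeType = "PARTICIPATED_IN" | "SUPERSEDES";

interface GraphNode {
  id: string;
  type: NodeType;
  label: string;
}

interface GraphEdge {
  from: string;
  to: string;
  type: EdgeType;
}

class TypedGraph {
  private nodes = new Map<string, GraphNode>();
  private edges: GraphEdge[] = [];

  addNode(node: GraphNode): void {
    this.nodes.set(node.id, node);
  }

  addEdge(edge: GraphEdge): void {
    if (!this.nodes.has(edge.from) || !this.nodes.has(edge.to)) {
      throw new Error(`unknown endpoint in edge ${edge.from} -> ${edge.to}`);
    }
    this.edges.push(edge);
  }

  // Follow outgoing edges of one type from a node and return the targets.
  neighbors(id: string, type: EdgeType): GraphNode[] {
    const result: GraphNode[] = [];
    for (const edge of this.edges) {
      const target = this.nodes.get(edge.to);
      if (edge.from === id && edge.type === type && target) {
        result.push(target);
      }
    }
    return result;
  }
}

const graph = new TypedGraph();
graph.addNode({ id: "p1", type: "Person", label: "Sarah" });
graph.addNode({ id: "d1", type: "Decision", label: "Move to annual pricing" });
graph.addNode({ id: "c1", type: "Commitment", label: "Keep monthly pricing" });
graph.addEdge({ from: "p1", to: "d1", type: "PARTICIPATED_IN" });
graph.addEdge({ from: "d1", to: "c1", type: "SUPERSEDES" });

// Traverse Person -> Decision -> superseded Commitment.
const superseded: GraphNode[] = [];
for (const decision of graph.neighbors("p1", "PARTICIPATED_IN")) {
  superseded.push(...graph.neighbors(decision.id, "SUPERSEDES"));
}
```

Because each hop is typed, the traversal can ask for "decisions Sarah took part in" and "what those decisions replaced" as separate, explicit steps rather than hoping similar text turns up.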

Two stores. One truth.

Neo4j (structural)

Relationships, traversal, provenance

Every write lands in both

pgvector (semantic)

Embeddings, similarity, fuzzy recall

Ask "what did I decide with Sarah about pricing?" — semantic search finds the neighborhood, graph traversal finds the exact node and its full context.
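The two-store idea can be sketched with in-memory stand-ins. Nothing below uses the Neo4j or pgvector client libraries; the store shapes, the write and recall functions, and the toy three-dimensional embeddings are all assumptions made for illustration.

```typescript
// In-memory stand-ins for the structural and semantic stores.
interface MemoryRecord {
  id: string;
  text: string;
  embedding: number[];
  links: string[]; // ids of directly related records
}

const graphStore = new Map<string, string[]>(); // structural: id -> linked ids
const vectorStore = new Map<string, number[]>(); // semantic: id -> embedding
const texts = new Map<string, string>();

// Every write lands in both stores.
function write(record: MemoryRecord): void {
  graphStore.set(record.id, record.links);
  vectorStore.set(record.id, record.embedding);
  texts.set(record.id, record.text);
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Step 1: similarity search picks the closest record (the neighborhood).
// Step 2: graph lookup returns that record's direct neighbors (the context).
function recall(queryEmbedding: number[]): { hit: string; context: string[] } {
  let bestId = "";
  let bestScore = -Infinity;
  vectorStore.forEach((embedding, id) => {
    const score = cosine(queryEmbedding, embedding);
    if (score > bestScore) {
      bestId = id;
      bestScore = score;
    }
  });
  const linked = graphStore.get(bestId) ?? [];
  const context = linked.map((id) => texts.get(id) ?? id);
  return { hit: texts.get(bestId) ?? bestId, context };
}

write({ id: "d1", text: "Decision: annual pricing", embedding: [1, 0, 0], links: ["p1", "m1"] });
write({ id: "p1", text: "Person: Sarah", embedding: [0, 1, 0], links: ["d1"] });
write({ id: "m1", text: "Meeting: pricing review", embedding: [0, 0, 1], links: ["d1"] });

const answer = recall([0.9, 0.1, 0]);
```

In this toy run the similarity step lands on the decision record, and the graph step pulls in the person and the meeting attached to it.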

Three-tier retrieval. Latency-first.

Tier 1: Template matching (< 50ms)

Common query patterns resolved deterministically. No LLM needed.

Tier 2: Cached execution (< 20ms)

Previously validated retrieval code runs directly. Fast path for repeated patterns.

Tier 3: Agentic REPL (LLM-generated)

Novel or complex queries trigger Groq-powered Cypher + JS generation, executed in a sandboxed V8 isolate with an automatic self-repair loop.
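A self-repair loop of the kind Tier 3 describes might look roughly like this. The generate and execute callbacks are hypothetical stand-ins for the LLM call and the sandboxed run, the retry budget of 3 is an assumption, and the sketch is kept synchronous for brevity where real calls would be asynchronous.

```typescript
// Hypothetical signatures: `generate` stands in for the LLM call and
// `execute` for the sandboxed run. Neither is LittleGuy's actual API.
type Generate = (query: string, lastError?: string) => string;
type Execute = (code: string) => unknown;

interface RepairResult {
  value: unknown;
  attempts: number;
}

function runWithRepair(
  query: string,
  generate: Generate,
  execute: Execute,
  maxAttempts: number = 3, // assumed retry budget
): RepairResult {
  let lastError: string | undefined;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // On retries the previous error is passed back to the generator.
    const code = generate(query, lastError);
    try {
      return { value: execute(code), attempts: attempt };
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  throw new Error(`gave up after ${maxAttempts} attempts: ${lastError}`);
}
```

The key design point is that the error message from a failed run becomes input to the next generation, so the loop converges on working code or stops at a fixed budget.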

Routing flow

Incoming query

↓

Route by confidence + pattern

↓

Tier 1: deterministic template

Tier 2: cached execution

Tier 3: sandboxed agentic REPL

↓

Contextual answer
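The routing flow above can be sketched as a single function. The template patterns, the 0.8 threshold, and the idea that confidence arrives as a score from an upstream classifier are all assumptions for illustration.

```typescript
type Tier = "template" | "cached" | "agentic";

interface Route {
  tier: Tier;
  reason: string;
}

// Hypothetical template patterns and threshold.
const TEMPLATES: RegExp[] = [
  /^what did i decide with (\w+) about (.+)\?$/i,
  /^who is (\w+)\?$/i,
];
const CONFIDENCE_THRESHOLD = 0.8;

// Collapse case and whitespace so equivalent queries share one cache key.
function normalize(query: string): string {
  return query.trim().toLowerCase().replace(/\s+/g, " ");
}

function route(query: string, confidence: number, cachedKeys: ReadonlySet<string>): Route {
  const key = normalize(query);
  const matchesTemplate = TEMPLATES.some((pattern) => pattern.test(key));
  if (matchesTemplate && confidence >= CONFIDENCE_THRESHOLD) {
    return { tier: "template", reason: "known pattern, high confidence" };
  }
  if (cachedKeys.has(key)) {
    return { tier: "cached", reason: "validated retrieval code on file" };
  }
  return { tier: "agentic", reason: "novel query, generate and sandbox" };
}
```

For example, route("What did I decide with Sarah about pricing?", 0.95, new Set()) resolves to the template tier, while an unmatched, uncached query falls through to the agentic tier.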

LLM-generated code runs in a V8 isolate.

When Tier 3 kicks in, LittleGuy generates Cypher and JavaScript to answer complex graph queries. That code runs inside an isolated-vm sandbox — no access to the host, no side effects, deterministic execution. Agentic flexibility without agentic risk.

26 node types. Every capture lands in the right place.

Person · Organization · Project · Task · Decision · Knowledge · Insight · Question · Event · Document · Topic · Area · Goal · Habit · Reflection · Memory · Commitment · Resource · Quote · Bookmark · Meeting · Note · Contact · Thread · Tag · Action

Auto-classified on ingest. No tagging. No manual sorting. Every captured thought, message, or voice note routed to the correct type by an LLM pass — then connected to its neighbors by the EDGE_INFER worker.
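Since classification is an LLM pass, its output needs to be checked against the closed set of 26 types before anything is written. A minimal sketch of that guard follows; the parseNodeType helper and its return-null-on-miss behavior are assumptions, not LittleGuy's actual code.

```typescript
// The 26 node types listed above, as a closed set.
const NODE_TYPES = [
  "Person", "Organization", "Project", "Task", "Decision", "Knowledge",
  "Insight", "Question", "Event", "Document", "Topic", "Area", "Goal",
  "Habit", "Reflection", "Memory", "Commitment", "Resource", "Quote",
  "Bookmark", "Meeting", "Note", "Contact", "Thread", "Tag", "Action",
] as const;

type NodeType = (typeof NODE_TYPES)[number];

// Hypothetical guard: accept a model's label only if it names a known type.
// Returns null on a miss so the caller decides how to handle it.
function parseNodeType(raw: string): NodeType | null {
  const cleaned = raw.trim().toLowerCase();
  for (const candidate of NODE_TYPES) {
    if (candidate.toLowerCase() === cleaned) {
      return candidate;
    }
  }
  return null;
}
```

Deriving the NodeType union from the array keeps the runtime check and the compile-time type in sync: adding a type in one place updates both.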

Async Processing Pipeline

4 workers

MEMORY_EXTRACT

Parses raw captures into structured memory objects — entities, facts, decisions.

CLASSIFY

LLM-driven node classification. Routes each memory to the right type in the graph.

EDGE_INFER

Discovers and creates relationships between nodes. Surfaces connections you didn't know existed.

TEMPORAL_DECAY

Runs on a scheduled cadence. Reduces relevance scores on unreferenced nodes. Memory that fades like real memory — unless you reinforce it.
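One plausible way to model this fading is exponential decay with a half-life. The formula, the 30-day half-life, and the reset-to-1 reinforcement rule below are assumptions; the source does not specify how scores are reduced.

```typescript
interface ScoredNode {
  id: string;
  relevance: number; // 0..1
  lastReferencedAt: number; // epoch milliseconds
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;
const HALF_LIFE_DAYS = 30; // assumed: relevance halves after 30 idle days

// Relevance after idling since the last reference:
// relevance * 0.5 ^ (idleDays / halfLife)
function decayedRelevance(node: ScoredNode, now: number): number {
  const idleDays = Math.max(0, now - node.lastReferencedAt) / MS_PER_DAY;
  return node.relevance * Math.pow(0.5, idleDays / HALF_LIFE_DAYS);
}

// Referencing a node restores full relevance and restarts the clock.
function reinforce(node: ScoredNode, now: number): ScoredNode {
  return { id: node.id, relevance: 1, lastReferencedAt: now };
}

// One scheduled pass over a batch of nodes.
function decayPass(nodes: ScoredNode[], now: number): ScoredNode[] {
  return nodes.map((node) => ({
    id: node.id,
    relevance: decayedRelevance(node, now),
    lastReferencedAt: node.lastReferencedAt,
  }));
}
```

Under these assumptions a node untouched for 30 days drops to half its score, and one untouched for 60 days drops to a quarter, while any reference resets it.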

Memory that works without being asked.

Pre-meeting brief

15 minutes before a calendar event, your agent surfaces the last 3 interactions with attendees, any open commitments, and relevant decisions.

Daily briefing

7 AM. Your priorities, your commitments, the context your agents need to start the day.

Cross-agent continuity

Tell one agent something once. Every agent with access to your LittleGuy knows it. Permanently.

Full OAuth 2.0 + PKCE MCP Server

LittleGuy ships a compliant MCP server with full OAuth 2.0 + PKCE authentication. Any MCP-capable client can connect — Claude.ai Max, ChatGPT, or your own agents — with proper token scoping, refresh, and revocation. Not a proxy. A real auth server.
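For readers unfamiliar with PKCE, the core of it is small. The sketch below shows the S256 method from RFC 7636 using Node's built-in crypto module; the function names are illustrative and are not taken from LittleGuy's server.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Client side: a random code_verifier. 32 random bytes encode to
// 43 base64url characters, the minimum length RFC 7636 allows.
function createCodeVerifier(): string {
  return randomBytes(32).toString("base64url");
}

// S256 method: code_challenge = BASE64URL(SHA256(ASCII(code_verifier)))
function createCodeChallenge(verifier: string): string {
  return createHash("sha256").update(verifier, "ascii").digest("base64url");
}

// Server side: at token exchange, recompute the challenge from the
// presented verifier and compare it with the one stored at authorization.
function verifierMatches(verifier: string, storedChallenge: string): boolean {
  return createCodeChallenge(verifier) === storedChallenge;
}
```

The client sends only the challenge when it starts the flow and reveals the verifier at token exchange, so an intercepted authorization code is useless without the verifier.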

Claude.ai Max · ChatGPT MCP · OpenClaw · Any MCP Client