Web · Claude · ChatGPT (MCP) · Gemini

Persistent memory
for AI agents.

A working demonstration of shared context between humans and the agents they use.
Built on personal infrastructure.
An exhibition project by Mark Ferraz.

Shared memory wherever work happens

Claude

ChatGPT

Gemini

iOS

Android

How it works

Captures land in a typed graph: people, decisions, tasks, knowledge, events, projects. Async workers classify each capture, infer relationships, and let stale relevance decay over time. Agents query the same graph over MCP. Humans and agents read from one source. That is the whole demonstration.

Built for one person. Made available to others.

I built this for myself and the agents I work with. The same memory layer is exposed over MCP so other people and other agents can use it.

MCP-native shared memory

Connect Claude, ChatGPT MCP, Gemini, and internal agents to one memory plane. Query context, commit outcomes, and preserve continuity across every tool.

memory-context

memory-recall

memory-commit

memory-relate

memory-brief

memory-forget
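As a rough illustration of how a client might invoke one of these tools: MCP tool calls are JSON-RPC 2.0 requests using the `tools/call` method. The sketch below only builds such a request. The argument names (`query`, `limit`) are assumptions for illustration, since this page does not document the input schema of `memory-recall`.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical arguments: the real schema for memory-recall may differ.
request = build_tool_call(
    1, "memory-recall", {"query": "decisions about pricing", "limit": 5}
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library would send this over the negotiated transport after authentication. The dictionary above is just the message shape.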

Main + subagent delegation

Main agents orchestrate strategy while subagents execute tightly scoped jobs, each with a bounded tool set and only the slice of context that job needs.

Main Agent: plan + route

Subagent A: research scope

Subagent B: implementation scope

Subagent C: validation scope

Skills + tool surfaces

Attach domain skills to agents, gate tools by responsibility, and keep capabilities explicit so delegation remains auditable.

Skills · Tool Policies · MCP Endpoints · Role-specific Prompts

Token and context management

Keep prompts lean by routing memory retrieval: fast paths for known patterns, deeper recall only when confidence or novelty requires it.

Deterministic recall for frequent intents

Cached execution for repeated workflows

Agentic retrieval only for complex or novel asks

Memory filters by source, score, and recency
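The routing idea above can be sketched as a small decision function. This is a minimal sketch under stated assumptions, not the actual router: the `Router` class, the `confidence_floor` threshold, and the order of checks are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Router:
    # Hypothetical stores: intent name -> canned query, prompt -> stored result.
    known_intents: dict = field(default_factory=dict)
    cache: dict = field(default_factory=dict)
    confidence_floor: float = 0.7  # assumed threshold, not from the source

    def route(self, prompt: str, intent: Optional[str], confidence: float) -> str:
        """Pick the cheapest retrieval mode that fits the request."""
        # Fast path: a frequent intent recognized with enough confidence.
        if intent in self.known_intents and confidence >= self.confidence_floor:
            return "deterministic"
        # Repeated workflow: reuse a stored result for the same prompt.
        if prompt in self.cache:
            return "cached"
        # Novel or low-confidence ask: fall back to deeper agentic retrieval.
        return "agentic"


router = Router(
    known_intents={"daily_brief": "recent decisions and open tasks"},
    cache={"weekly report": "stored result"},
)
print(router.route("brief me", "daily_brief", 0.9))
print(router.route("weekly report", None, 0.2))
print(router.route("why did we drop vendor X?", None, 0.2))
```

The point of the ordering is cost: the expensive agentic path runs only when neither cheaper path applies.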

Classification + scoping

Every capture is typed and connected so agents retrieve usable context instead of raw logs. Scope follows project, role, and task intent.

Classify: decision, task, knowledge, event

Relate: people, projects, commitments

Scope: project-tag + role boundaries

Decay: stale context loses weight

One memory across web and mobile

Capture on mobile, monitor on web, and execute through MCP-enabled agents. Same memory graph, same context lineage, everywhere.

Web Control Console

Mobile Capture

MCP Agent Clients

Cross-team Shared Context

Unified trust boundaries

Control plane internals

Architecture for durable human-agent trust

LittleGuy is designed as a Memory Control Plane: structured memory, scoped execution, and portable context across web, mobile, and MCP clients.

Graph + Vector

Memory substrate

Neo4j relationships + pgvector semantic recall in a dual-store architecture.

26 node types

Typed memory

Decisions, tasks, people, events, documents, and more for higher-precision retrieval.

OAuth + PKCE

MCP auth

Token issuance, refresh, and revocation for secure agent-to-memory connectivity.
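For readers unfamiliar with PKCE, the client-side step can be sketched as follows. It follows the S256 method from RFC 7636: the challenge is the unpadded base64url encoding of the SHA-256 hash of the verifier. The function names are illustrative, and nothing here reflects how LittleGuy's own authorization server is implemented.

```python
import base64
import hashlib
import secrets


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_pkce_pair():
    """Return (code_verifier, code_challenge) for the S256 method."""
    # 32 random bytes encode to a 43-character verifier (allowed range: 43-128).
    verifier = _b64url(secrets.token_bytes(32))
    challenge = _b64url(hashlib.sha256(verifier.encode("ascii")).digest())
    return verifier, challenge


def verify_pkce(verifier: str, challenge: str) -> bool:
    """Server-side check: recompute the challenge and compare in constant time."""
    expected = _b64url(hashlib.sha256(verifier.encode("ascii")).digest())
    return secrets.compare_digest(expected, challenge)


verifier, challenge = make_pkce_pair()
print(verify_pkce(verifier, challenge))
```

The client sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code is useless without the verifier.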

Latency aware

Runtime discipline

Deterministic, cached, and agentic retrieval modes keep token use and latency under control.

Async memory lifecycle

4 workers

MEMORY_EXTRACT

Parses raw capture into structured objects: entities, facts, commitments, and decisions.

CLASSIFY

Assigns node type and confidence so retrieval can route to relevant context classes.

EDGE_INFER

Connects people, projects, and knowledge over time to maintain relationship-aware recall.

TEMPORAL_DECAY

Reduces stale relevance unless reinforced, keeping memory fresh without manual cleanup.
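One common way to model "decay unless reinforced" is an exponential half-life. The sketch below assumes that model. The page does not state the actual decay function, and the `MemoryNode` class, the 30-day half-life, and the `boost` value are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MemoryNode:
    weight: float           # relevance as of the last reinforcement
    last_reinforced: float  # day number of the last reinforcement

    def relevance(self, now: float, half_life_days: float = 30.0) -> float:
        """Relevance halves every `half_life_days` without reinforcement."""
        age = max(0.0, now - self.last_reinforced)
        return self.weight * 0.5 ** (age / half_life_days)

    def reinforce(self, now: float, boost: float = 1.0) -> None:
        """Carry the decayed relevance forward, add a boost, reset the clock."""
        self.weight = self.relevance(now) + boost
        self.last_reinforced = now


node = MemoryNode(weight=1.0, last_reinforced=0.0)
print(node.relevance(now=30.0))  # one half-life after capture
node.reinforce(now=30.0)         # the memory is referenced again
print(node.relevance(now=60.0))  # one half-life after reinforcement
```

Because relevance is computed from the last reinforcement time, a worker only needs to touch a node when it is reinforced. Nothing has to be rewritten on every tick.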

Typed memory schema

Person · Organization · Project · Task · Decision · Knowledge · Insight · Question · Event · Document · Topic · Area · Goal · Habit · Reflection · Memory · Commitment · Resource · Quote · Bookmark · Meeting · Note · Contact · Thread · Tag · Action

Governance primitives

Delegation + scoping

Main agents delegate tasks to subagents with explicit boundaries for memory and tools, reducing accidental context bleed.
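A minimal sketch of what explicit boundaries could look like, assuming a scope is a set of allowed tools plus a set of project tags. The `Scope` class and `delegate` function are hypothetical. The tool names come from the list earlier on this page, and the project tags are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scope:
    tools: frozenset
    project_tags: frozenset

    def can_use(self, tool: str) -> bool:
        return tool in self.tools

    def visible(self, memories: list) -> list:
        """Keep only memories tagged with a project inside this scope."""
        return [m for m in memories if m["project"] in self.project_tags]


def delegate(parent: Scope, tools, project_tags) -> Scope:
    """Create a subagent scope. Intersection means a child can only narrow."""
    return Scope(
        tools=parent.tools & frozenset(tools),
        project_tags=parent.project_tags & frozenset(project_tags),
    )


main = Scope(
    tools=frozenset({"memory-recall", "memory-commit", "memory-forget"}),
    project_tags=frozenset({"alpha", "beta"}),
)
# The research subagent asks for more than the parent holds; it gets less.
research = delegate(main, {"memory-recall", "memory-brief"}, {"alpha", "gamma"})

memories = [
    {"project": "alpha", "text": "Chose vendor A"},
    {"project": "beta", "text": "Budget approved"},
]
print(research.can_use("memory-forget"))
print(research.visible(memories))
```

Taking the intersection with the parent scope is the design choice that prevents context bleed: a subagent cannot be granted a tool or project the delegating agent does not hold.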

Skill-aware execution

Skills and tool surfaces align to roles so each agent uses the right capabilities for the right class of work.

Token management

Recall policies prioritize high-signal context first, then expand only when required by novelty or ambiguity.

MCP as the trust transport layer

OAuth 2.0 + PKCE secures client connectivity while scoped tokens, revocation, and policy-aware tool surfaces keep memory access intentional. This is the foundation for reliable collaboration between humans, main agents, and delegated subagents.

Claude.ai Max · ChatGPT MCP · Gemini · Web · Mobile
