LittleGuy
Web · Claude · ChatGPT (MCP) · Gemini

Persistent memory
for AI agents.

A working demonstration of shared context between humans and the agents they use.
Built on personal infrastructure.
An exhibition project by Mark Ferraz.


Shared memory wherever work happens

Claude
ChatGPT
Gemini
iOS
Android

How it works

Captures land in a typed graph: people, decisions, tasks, knowledge, events, projects. Async workers classify each capture, infer relationships, and let stale relevance decay over time. Agents query the same graph over MCP. Humans and agents read from one source. That is the whole demonstration.
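As a rough illustration of the typed graph described above, the sketch below models captures as typed nodes joined by labeled edges. The class names, field names, the `decided_by` relation, and the `neighbors` helper are all hypothetical; this is not the project's actual schema or storage code.

```python
from dataclasses import dataclass, field

# Node types taken from the paragraph above; the lowercase spelling is an assumption.
NODE_TYPES = {"person", "decision", "task", "knowledge", "event", "project"}


@dataclass
class Node:
    id: str
    type: str
    text: str

    def __post_init__(self):
        # Reject anything outside the typed vocabulary.
        if self.type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {self.type}")


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (source_id, relation, target_id)

    def add(self, node):
        self.nodes[node.id] = node

    def relate(self, source_id, relation, target_id):
        if source_id not in self.nodes or target_id not in self.nodes:
            raise KeyError("both endpoints must exist before relating them")
        self.edges.append((source_id, relation, target_id))

    def neighbors(self, node_id):
        """Return the nodes reachable over one outgoing edge."""
        return [self.nodes[t] for s, _, t in self.edges if s == node_id]


graph = Graph()
graph.add(Node("p1", "person", "Mark"))
graph.add(Node("d1", "decision", "Ship the MCP endpoint first"))
graph.relate("d1", "decided_by", "p1")
```

Because humans and agents would read the same structure, a query such as `graph.neighbors("d1")` gives either reader the same answer.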

Built for one person. Made available to others.

I built this for myself and the agents I work with. The same memory layer is exposed over MCP so other people and other agents can use it.

MCP-native shared memory

Connect Claude, ChatGPT MCP, Gemini, and internal agents to one memory plane. Query context, commit outcomes, and preserve continuity across every tool.

memory-context
memory-recall
memory-commit
memory-relate
memory-brief
memory-forget
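For a sense of what calling one of these tools might look like on the wire: MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `memory-recall`. The argument names (`query`, `limit`) are assumptions for illustration; the tool's real input schema is not documented here.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical arguments: the real memory-recall schema may differ.
request = build_tool_call(1, "memory-recall", {"query": "launch decisions", "limit": 5})
payload = json.dumps(request)
```

In practice an MCP client library would build and send this message; the dictionary is spelled out only to show its shape.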

Main + subagent delegation

Main agents orchestrate strategy while subagents execute tightly scoped jobs with bounded tools and contextual slices.

Main Agent: plan + route
|
Subagent A: research scope
Subagent B: implementation scope
Subagent C: validation scope
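The delegation pattern above can be sketched as a main agent that routes each job to a subagent and refuses any tool outside that subagent's bounds. The tool assignments and the `route` function are hypothetical, chosen only to show the bounded-tools idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subagent:
    name: str
    scope: str
    allowed_tools: frozenset

    def can_use(self, tool):
        return tool in self.allowed_tools


# Hypothetical tool assignments per scope.
SUBAGENTS = [
    Subagent("A", "research", frozenset({"memory-recall", "memory-context"})),
    Subagent("B", "implementation", frozenset({"memory-context", "memory-commit"})),
    Subagent("C", "validation", frozenset({"memory-recall", "memory-brief"})),
]


def route(scope, tool):
    """Pick the subagent for a scope, refusing tools outside its bounds."""
    for agent in SUBAGENTS:
        if agent.scope == scope:
            if not agent.can_use(tool):
                raise PermissionError(f"subagent {agent.name} may not call {tool}")
            return agent
    raise LookupError(f"no subagent for scope: {scope}")
```

Here `route("research", "memory-recall")` returns subagent A, while `route("research", "memory-commit")` raises `PermissionError`, so a research job cannot write to memory.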

Skills + tool surfaces

Attach domain skills to agents, gate tools by responsibility, and keep capabilities explicit so delegation remains auditable.

Skills
Tool Policies
MCP Endpoints
Role-specific Prompts

Token and context management

Keep prompts lean by routing memory retrieval: fast paths for known patterns, deeper recall only when confidence or novelty requires it.

Deterministic recall for frequent intents
Cached execution for repeated workflows
Agentic retrieval only for complex or novel asks
Memory filters by source, score, and recency
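One way to picture the routing above is a function that picks the cheapest mode able to serve a request, plus a filter over source, score, and recency. Everything here is an assumption for illustration: the function names, the novelty score and its 0.7 threshold, the order of the checks, and the memory record fields.

```python
from datetime import datetime, timedelta, timezone


def choose_mode(intent, cache, known_intents, novelty, novelty_threshold=0.7):
    """Return the cheapest retrieval mode that can serve this intent."""
    if intent in cache:
        return "cached"  # repeated workflow: reuse the stored result
    if intent in known_intents and novelty < novelty_threshold:
        return "deterministic"  # frequent intent: fixed query, no model call
    return "agentic"  # complex or novel ask: deeper recall


def filter_memories(memories, sources, min_score, max_age, now):
    """Keep memories from allowed sources that are both strong and recent."""
    return [
        m
        for m in memories
        if m["source"] in sources
        and m["score"] >= min_score
        and now - m["captured_at"] <= max_age
    ]
```

Checking the cache before anything else is a design choice in this sketch, not a documented behavior; a real router could weigh the modes differently.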

Classification + scoping

Every capture is typed and connected so agents retrieve usable context instead of raw logs. Scope follows project, role, and task intent.

Classify: decision, task, knowledge, event
Relate: people, projects, commitments
Scope: project-tag + role boundaries
Decay: stale context loses weight
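The scoping step above could look roughly like this: a capture is visible only if it carries the requested project tag and its type falls inside the role's boundary. The role names, the role-to-type mapping, and the capture fields are hypothetical.

```python
# Hypothetical mapping from role to the node types that role may read.
ROLE_VISIBLE_TYPES = {
    "researcher": {"knowledge", "event"},
    "planner": {"decision", "task", "knowledge", "event"},
}


def scoped(captures, project, role):
    """Return captures tagged with the project whose type the role may see."""
    visible = ROLE_VISIBLE_TYPES.get(role, set())  # unknown roles see nothing
    return [
        c
        for c in captures
        if project in c["projects"] and c["type"] in visible
    ]


captures = [
    {"id": "c1", "type": "decision", "projects": {"atlas"}},
    {"id": "c2", "type": "knowledge", "projects": {"atlas"}},
    {"id": "c3", "type": "knowledge", "projects": {"borealis"}},
]
```

Defaulting unknown roles to an empty set keeps the sketch fail-closed: a role that has not been declared retrieves nothing.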

One memory across web and mobile

Capture on mobile, monitor on web, and execute through MCP-enabled agents. Same memory graph, same context lineage, everywhere.

Web Control Console
Mobile Capture
MCP Agent Clients
Cross-team Shared Context
Unified trust boundaries

Control plane internals

Architecture for durable human-agent trust

LittleGuy is designed as a Memory Control Plane: structured memory, scoped execution, and portable context across web, mobile, and MCP clients.

Graph + Vector
Memory substrate

Neo4j relationships + pgvector semantic recall in a dual-store architecture.

26 node types
Typed memory

Decisions, tasks, people, events, documents, and more for higher-precision retrieval.

OAuth + PKCE
MCP auth

Token issuance, refresh, and revocation for secure agent-to-memory connectivity.
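For readers unfamiliar with PKCE, the client side of the exchange starts with a random code verifier and a derived code challenge. The sketch below follows the S256 method from RFC 7636 using only the standard library; the function names are illustrative, and nothing here reflects LittleGuy's own endpoints or token format.

```python
import base64
import hashlib
import secrets


def challenge_for(verifier):
    """Derive the S256 code challenge: base64url(sha256(verifier)), unpadded."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair():
    """Return a fresh (code_verifier, code_challenge) pair."""
    # 32 random bytes encode to a 43-character URL-safe verifier.
    raw = secrets.token_bytes(32)
    verifier = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return verifier, challenge_for(verifier)


verifier, challenge = make_pkce_pair()
```

The client sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code is useless without the verifier.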

Latency aware
Runtime discipline

Deterministic, cached, and agentic retrieval modes keep token use and latency under control.

Async memory lifecycle

4 workers

MEMORY_EXTRACT

Parses raw capture into structured objects: entities, facts, commitments, and decisions.

CLASSIFY

Assigns node type and confidence so retrieval can route to relevant context classes.

EDGE_INFER

Connects people, projects, and knowledge over time to maintain relationship-aware recall.

TEMPORAL_DECAY

Reduces stale relevance unless reinforced, keeping memory fresh without manual cleanup.
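The decay step can be pictured with a simple half-life model. The text above says only that stale relevance is reduced unless reinforced; the exponential curve, the 30-day half-life, and the boost size below are assumptions chosen for illustration.

```python
import math


def decayed_weight(weight, days_since_reinforced, half_life_days=30.0):
    """Exponential decay: the weight halves every half_life_days."""
    return weight * math.pow(0.5, days_since_reinforced / half_life_days)


def reinforce(weight, boost=0.5, cap=1.0):
    """Raise the weight when a memory is used again, without exceeding the cap."""
    return min(cap, weight + boost)
```

Under these assumptions, a memory with weight 1.0 that goes untouched for 30 days drops to 0.5, and a single reinforcement brings it back to the cap.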

Typed memory schema

Person · Organization · Project · Task · Decision · Knowledge · Insight · Question · Event · Document · Topic · Area · Goal · Habit · Reflection · Memory · Commitment · Resource · Quote · Bookmark · Meeting · Note · Contact · Thread · Tag · Action

Governance primitives
Delegation + scoping

Main agents delegate tasks to subagents with explicit boundaries for memory and tools, reducing accidental context bleed.

Skill-aware execution

Skills and tool surfaces align to roles so each agent uses the right capabilities for the right class of work.

Token management

Recall policies prioritize high-signal context first, then expand only when required by novelty or ambiguity.
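A recall policy of this kind can be sketched as greedy packing under a token budget: take the highest-scoring items first and stop adding once the budget is full. The item fields (`score`, `tokens`) and the function name are hypothetical.

```python
def pack_context(items, token_budget):
    """Greedily select the highest-scoring items that fit within the budget."""
    chosen, used = [], 0
    for item in sorted(items, key=lambda i: i["score"], reverse=True):
        if used + item["tokens"] <= token_budget:
            chosen.append(item)
            used += item["tokens"]
    return chosen, used


items = [
    {"id": "a", "score": 0.9, "tokens": 300},
    {"id": "b", "score": 0.8, "tokens": 500},
    {"id": "c", "score": 0.4, "tokens": 200},
]
chosen, used = pack_context(items, token_budget=600)
```

Expanding "only when required" would then amount to calling `pack_context` again with a larger budget when the first answer looks ambiguous or the request is novel.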

MCP as the trust transport layer

OAuth 2.0 + PKCE secures client connectivity while scoped tokens, revocation, and policy-aware tool surfaces keep memory access intentional. This is the foundation for reliable collaboration between humans, main agents, and delegated subagents.

Claude.ai Max · ChatGPT MCP · Gemini · Web · Mobile
© 2026 LittleGuy. All rights reserved.