Ops Deck
Multi-Agent Local Stack — Manus · Perplexity · Claude · Gemini · Codex
🧠 Multi-Agent Local Stack · Plug-and-Play
Your own Manus-like system,
fully under your control
fully under your control
Five AI agents running locally via Docker — working on the same project, sharing files and vector memory,
reviewing each other's outputs, and iterating until quality passes a judge threshold.
Perplexity researches. Manus builds. Claude critiques. Loop until score ≥ 8.
5
Agents
6
Services
3
Context layers
≥8
Quality gate
Part of AgentOS
This Ops Deck is the Intelligence + Execution layers of AgentOS — the full 7-layer AI operating system. View the complete system architecture:
1
Architecture Overview
User → Orchestrator → Workers → Shared Context
User
→
Orchestrator (Node.js)
:3000 · routes + loops
↓
Perplexity
Research Agent
sonar-pro
sonar-pro
·
Manus
Build Agent
gpt-4o-mini
gpt-4o-mini
·
Judge (Claude)
Critic Agent
claude-3-opus
claude-3-opus
↕
Shared Context Layer
/data (files) · Postgres (state) · Qdrant (vector memory)
Orchestration Loop
1
Perplexity Research Phase
POST
http://perplexity_worker:8000/run — queries sonar-pro with the task goal, writes findings to /data/docs/research.md2
Manus Build Phase
POST
http://manus_worker:8000/run — reads research.md, builds implementation via gpt-4o-mini, writes to /data/code/implementation.py3
Judge Review Phase
POST
http://judge_worker:8000/review — Claude reviews implementation.py, scores 1–10, writes feedback to /data/docs/review.md?
Gate Score < 8 → Revise Loop
If review score < 8: POST
manus_worker:8000/revise — Manus reads review.md + current code, improves and rewrites. Judge re-reviews. Loop continues until score ≥ 8.✓
Done — quality gate passed
Score ≥ 8: orchestrator returns
{ done: true }. Final artifacts in /data/code/ and /data/docs/.2
Agent Workers
5 agents · 3 primary + 2 extended
Perplexity
Research Agent
Deep web research using
sonar-pro — the same model powering Perplexity's answer engine.
Receives the task goal, queries the live web, synthesizes findings into structured research.md.
This grounds Manus's implementation in real, current information rather than stale training data.
sonar-pro
api.perplexity.ai
PERPLEXITY_API_KEY
worker.py endpoint
@app.post("/run")
def run(state):
# POST to sonar-pro with goal
text = query_perplexity(state["goal"])
write("/data/docs/research.md", text)
return {"status": "done"}
Manus
Build Agent
Execution agent — reads research, builds implementation, revises based on judge feedback.
Uses
gpt-4o-mini for cost-effective code synthesis.
Has two endpoints: /run for initial build and /revise for improvement loops —
the core of the self-healing cycle.
gpt-4o-mini
api.openai.com
OPENAI_API_KEY
worker.py endpoints
@app.post("/run") # initial build
@app.post("/revise") # improvement loop
reads: research.md | review.md
writes: /data/code/implementation.py
Judge (Claude)
Critic Agent
Quality gate — Claude reads the implementation and returns structured feedback + a numeric score.
Uses
claude-3-opus-20240229 for the best critique quality.
Score < 8 triggers a Manus revision. Score ≥ 8 marks the task complete.
Writes feedback to review.md so Manus can act on it.
claude-3-opus
api.anthropic.com
ANTHROPIC_API_KEY
worker.py endpoint
@app.post("/review")
reads: implementation.py
writes: /data/docs/review.md
return {"score": 1–10} # gate at 8
Extended Agent Roles (plug in as additional workers)
Gemini
Multimodal Reasoning
Add as a second research worker for multimodal tasks — image analysis, long-context document reading (2M tokens), or cross-format synthesis. Replace or augment the Perplexity research step for document-heavy projects.
gemini-2.5-pro
GOOGLE_API_KEY
Codex
Test Generation
Add between Manus /run and Judge /review — Codex writes a test suite against the implementation before Claude judges it. Tests failing = automatic revise loop before the score gate, saving Judge API calls.
gpt-5.4
OPENAI_API_KEY
3
Shared Context Layer
/data · Postgres · Qdrant
/data Volume
Shared Docker volume mounted at
/app/data on every container. Any agent write is immediately visible to all other agents — no API calls needed for file exchange.
/data
├── /docs
│ ├── research.md ← Perplexity writes
│ └── review.md ← Judge writes
├── /code
│ └── implementation.py ← Manus writes
└── state.json ← Orchestrator tracks
Postgres (State)
Persistent relational state — task metadata, agent run history, loop counts, scores per iteration. Survives container restarts. Lets the orchestrator query "how many revise loops have run for task X?"
POSTGRES_USER: ai
POSTGRES_PASSWORD: ai
POSTGRES_DB: ai
Port: 5432 (internal)
Image: postgres:15
Qdrant (Vector Memory)
Long-term semantic memory — agents store key decisions, research findings, and past solutions as vectors. Future tasks retrieve relevant context via similarity search, so agents don't repeat prior mistakes or redo prior research.
Port: 6333 (exposed)
Image: qdrant/qdrant
Collection: "memory"
client.upsert() to store
client.search() to recall
Qdrant integration snippet
from qdrant_client import QdrantClient
client = QdrantClient(host="qdrant", port=6333)
client.upsert(
collection_name="memory",
points=[{
"id": 1,
"vector": embedding,
"payload": {"text": decision}
}]
)
4
Docker Compose Services
6 services · docker-compose up --build
| Service | Type | Port | Image / Build | Env & Volumes |
|---|---|---|---|---|
| orchestrator | Orchestrator | 3000 | ./orchestrator (Node.js) | .env · ./data:/app/data |
| perplexity_worker | Worker | 8000 (internal) | ./workers/perplexity (FastAPI) | .env · ./data:/app/data |
| manus_worker | Worker | 8000 (internal) | ./workers/manus (FastAPI) | .env · ./data:/app/data |
| judge_worker | Worker | 8000 (internal) | ./workers/judge (FastAPI) | .env · ./data:/app/data |
| postgres | Infra | 5432 (internal) | postgres:15 | POSTGRES_USER/PASSWORD/DB |
| qdrant | Infra | 6333 | qdrant/qdrant | — |
docker-compose.yml (key structure)
version: "3.9"
services:
orchestrator:
build: ./orchestrator
ports: ["3000:3000"]
depends_on: [postgres, qdrant]
perplexity_worker: # Research
build: ./workers/perplexity
manus_worker: # Build + Revise
build: ./workers/manus
judge_worker: # Critique (Claude)
build: ./workers/judge
postgres: # State DB
image: postgres:15
qdrant: # Vector memory
image: qdrant/qdrant
ports: ["6333:6333"]
5
Environment Variables
Create .env in project root
Agent API Keys
OPENAI_API_KEYManus worker (gpt-4o-mini)
ANTHROPIC_API_KEYJudge worker (claude-3-opus)
GOOGLE_API_KEYGemini worker (optional)
PERPLEXITY_API_KEYPerplexity worker (sonar-pro)
Infrastructure
POSTGRES_USERai
POSTGRES_PASSWORDai (change in prod)
POSTGRES_DBai
QDRANT_HOSTqdrant (internal DNS)
6
Run It
Two commands to launch the full stack
Start all services
# Build images and start all 6 services
docker-compose up --build
# Wait for healthy status, then trigger a task
curl -X POST http://localhost:3000/run \
-H "Content-Type: application/json" \
-d '{"goal": "build a trading bot"}'
Orchestrator
localhost:3000
POST /run to trigger
Qdrant UI
localhost:6333
Browse vector memory
Output files
./data/
research.md · implementation.py · review.md
7
Upgrade Path
Expand the stack after the baseline works
Gemini Worker — Multimodal Reasoning
Add a second research worker using gemini-2.5-pro for tasks requiring image analysis, PDF parsing, or 2M-token context windows. Route multimodal research tasks to Gemini, text-only to Perplexity. Orchestrator picks based on task type in state.json.
Redis Pub/Sub — Real-Time Agent Triggers
Replace sequential HTTP calls with Redis publish/subscribe events. Each agent subscribes to its trigger channel — faster, non-blocking, and allows multiple agents to work in parallel. Add redis:alpine to docker-compose.
Next.js Dashboard — Visualize Agent Outputs
Add a Next.js frontend service that reads state.json + Postgres, showing live agent activity, loop counts, review scores per iteration, and final artifacts. Connects to the AI Shop dashboard you're reading now.
Task Graphs Instead of Linear Flows
Replace the linear Research→Build→Review chain with a DAG (directed acyclic graph) of tasks. Parallel sub-tasks — e.g., Perplexity researches API docs while Gemini reads existing codebase — then Manus merges both. Add a planner agent that generates the graph from the goal.
Git Auto-Commit Agent
Add a lightweight git worker that commits implementation.py after each successful review (score ≥ 8). Message includes agent, score, and iteration count:
feat(manus): impl v3 · judge 9/10. Full audit trail of the build loop.Evaluation Agent Swarm — Multiple Judges
Run 3+ judge workers in parallel (Claude, Gemini, GPT-4o as critics), aggregate their scores. Average score ≥ 8 triggers completion. Eliminates single-judge bias and produces richer review.md with multi-perspective feedback for Manus to revise against.
Agent Roles — Planner / Executor / Verifier
Add a Planner agent (Claude) that breaks the goal into sub-tasks before Manus executes them. Add a Verifier agent (Codex) that runs tests on the implementation before it reaches the Judge. Three distinct roles prevent the Executor from trying to do everything.
Self-Healing Loops — Retry with Modified Prompts
If Manus fails to improve (score doesn't go up after 2 revisions), the orchestrator automatically modifies the build prompt — adds constraints from review.md, changes the framing, reduces scope. Prevents infinite loops on intractable tasks.