PARALLAX
Deep-dive into agent frameworks, sandboxing, and compartmentalization for safe code analysis with Claude
Introduction
Etymology
Evokes multiple perspectives on the same system without any single observer seeing the whole, which matches the compartmentalized, summary-driven architecture. The name feels technical and neutral (optics/astronomy connotations), is not security- or NSFW-coded, and has no obvious tie to CERBERUS or explicit content domains.
Project requirements
MICA must incorporate support for the Mermaid diagram tool, which it currently lacks.
Related project suggestions
- UMBRA
- APERTURE
- SECTOR
Deep-dive into existing agent frameworks
Key things to understand:
- Tool protocol (how tools are defined and invoked; see the sketch after this list)
- Provider abstraction (how it talks to Claude/OpenAI/etc)
- Streaming & events model
- How context/prompts are managed
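For the tool protocol piece, a minimal Go sketch of how tools might be defined and dispatched; the Tool, ToolCall, and Registry names are illustrative assumptions, not any specific framework's API.

package tools

import (
	"context"
	"encoding/json"
	"fmt"
)

// ToolCall is what the model sends back: a tool name plus JSON arguments.
type ToolCall struct {
	Name string          `json:"name"`
	Args json.RawMessage `json:"arguments"`
}

// Tool is the contract every tool implements: a schema the provider can
// advertise to the model, and a handler that executes a validated call.
type Tool interface {
	Name() string
	Description() string
	Schema() json.RawMessage // JSON Schema for the arguments
	Run(ctx context.Context, args json.RawMessage) (string, error)
}

// Registry resolves model-issued tool calls to implementations.
type Registry struct {
	tools map[string]Tool
}

func NewRegistry(ts ...Tool) *Registry {
	r := &Registry{tools: map[string]Tool{}}
	for _, t := range ts {
		r.tools[t.Name()] = t
	}
	return r
}

func (r *Registry) Dispatch(ctx context.Context, call ToolCall) (string, error) {
	t, ok := r.tools[call.Name]
	if !ok {
		return "", fmt.Errorf("unknown tool %q", call.Name)
	}
	return t.Run(ctx, call.Args)
}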
Also look at:
- claude-sdk (Anthropic's official SDK)
- langchain tool system (inspiration, even if you don't use it)
- OpenAI Agents SDK
- Inspect how Crush's tool validation works (internal/agent/tools/)
- GitButler codegen
- Charmbracelet CRUSH / Fantasy
- GROK CLI
- GEMINI CLI
- KIRO CLI
Sandboxing technology research
Initial study reveals compartmentalization requires real isolation, not just policy. Research:
Container/VM options:
- gVisor - userspace kernel, good balance of isolation and performance
- Firecracker - microVMs, used by AWS Lambda
- bubblewrap - lightweight sandboxing, used by Flatpak
- Docker/Podman - may be too heavy, but well understood
Linux namespace isolation (sketched in Go after this list):
- Mount namespaces (filesystem isolation)
- PID namespaces (process isolation)
- Network namespaces (network isolation)
- User namespaces (privilege separation)
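A minimal Go sketch (Linux-only, standard library) of launching a child process inside fresh mount, PID, network, and user namespaces; the command and the UID/GID mapping are placeholders to adapt.

//go:build linux

package main

import (
	"log"
	"os"
	"os/exec"
	"syscall"
)

func main() {
	// Placeholder command; in practice this would be the sandboxed shell.
	cmd := exec.Command("/bin/sh", "-c", "id && ls /")
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr

	cmd.SysProcAttr = &syscall.SysProcAttr{
		// New mount, PID, network, and user namespaces for the child.
		Cloneflags: syscall.CLONE_NEWNS |
			syscall.CLONE_NEWPID |
			syscall.CLONE_NEWNET |
			syscall.CLONE_NEWUSER,
		// Map the current user to root inside the user namespace so the
		// child can set up mounts without real privileges on the host.
		UidMappings: []syscall.SysProcIDMap{{ContainerID: 0, HostID: os.Getuid(), Size: 1}},
		GidMappings: []syscall.SysProcIDMap{{ContainerID: 0, HostID: os.Getgid(), Size: 1}},
	}

	if err := cmd.Run(); err != nil {
		log.Fatalf("sandboxed command failed: %v", err)
	}
}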
Filesystem access control:
- Landlock LSM - modern Linux kernel feature for path-based restrictions (sketched below)
- seccomp-bpf - syscall filtering
- FUSE - userspace filesystem for interception/redaction
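For Landlock specifically, a sketch of a worker process restricting itself before touching any files. It assumes the go-landlock wrapper (github.com/landlock-lsm/go-landlock); the V1/BestEffort/RestrictPaths/RODirs helpers follow that package's documented style but should be verified against its current API, and the paths are placeholders.

//go:build linux

package main

import (
	"log"
	"os"

	"github.com/landlock-lsm/go-landlock/landlock"
)

func main() {
	// Assumed go-landlock usage: after this call the current process can
	// only read under the compartment root and write under its cache dir.
	// BestEffort degrades gracefully on kernels without Landlock support.
	err := landlock.V1.BestEffort().RestrictPaths(
		landlock.RODirs("/compartments/auth"),
		landlock.RWDirs("/var/cache/claude-c"),
	)
	if err != nil {
		log.Fatalf("landlock restriction failed: %v", err)
	}

	// Reads outside the allowed roots now fail with a permission error.
	if _, err := os.ReadFile("/etc/shadow"); err != nil {
		log.Printf("blocked as expected: %v", err)
	}
}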
Architecture diagram sessions
Create these diagrams (use Excalidraw, Mermaid, or paper):
A. Trust Boundary Diagram
[Untrusted]     ←→     [Redaction Layer]     ←→     [Trusted]
Agent                  Sanitization                 Real Code
Tools                  Summary Engine               Filesystem
LLM Context            Access Control               Secrets
B. Component Architecture (interface sketch after the list)
- Agent Engine - orchestration, multi-agent coordination
- Redaction Engine - sanitization, pseudonymization
- Sandbox Manager - filesystem, shell, tool execution
- Summarization Pipeline - file → folder → subsystem → system
- Policy Engine - role-based access, allowed tools per compartment
- Tool Registry - custom tools, validation, restriction
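One way to pin down API boundaries early is to write each component as a narrow Go interface; the sketch below mirrors the list above, with method names that are assumptions rather than decisions.

package parallax

import "context"

// Redactor sanitizes code or agent output before it crosses a trust boundary.
type Redactor interface {
	Redact(ctx context.Context, compartment, text string) (string, error)
}

// Sandbox executes a command inside an isolated environment for a compartment.
type Sandbox interface {
	Exec(ctx context.Context, compartment string, argv []string) (stdout string, err error)
}

// Summarizer produces the file → folder → subsystem → system summaries.
type Summarizer interface {
	Summarize(ctx context.Context, compartment, path string) (string, error)
}

// Policy decides whether a role may invoke a tool with given arguments.
type Policy interface {
	Allow(role, tool string, args map[string]any) error
}

// Engine orchestrates agents, consulting Policy before every tool call and
// routing all file content through Redactor and Sandbox.
type Engine interface {
	Handle(ctx context.Context, role, request string) (string, error)
}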
C. Data Flow Diagram
User Request → Task Router → Compartment Selector
→ Summary Provider → Agent (sandboxed) → Tool Execution (restricted)
→ Response Sanitizer → User
D. Hierarchical Summarization Tree
System Summary
├── Subsystem A Summary
│   ├── Folder A1 Summary
│   │   ├── File 1 Interface Extract
│   │   └── File 2 Interface Extract
│   └── Folder A2 Summary
└── Subsystem B Summary
    └── ...
Prototype critical unknowns
Before committing to a full architecture, build tiny spike prototypes (1-2 days each) to validate assumptions; a rough Go sketch follows each spike below:
Spike 1: Sandboxed shell execution
- Can you reliably restrict file access?
- Can you intercept and redact file reads?
- What's the performance overhead?
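A sketch for Spike 1 that shells out to bubblewrap instead of building namespaces by hand. The bwrap flags used here are common ones but should be checked against the installed version, and the compartment path plus toolchain binds are placeholders.

package main

import (
	"log"
	"os"
	"os/exec"
)

// runSandboxed executes a shell command with only the compartment directory
// visible as /work (read-only), no network, and private /tmp, /proc, /dev.
func runSandboxed(compartmentDir, command string) error {
	cmd := exec.Command("bwrap",
		"--ro-bind", "/usr", "/usr", // toolchain, read-only; adjust binds per distro
		"--ro-bind", "/bin", "/bin",
		"--ro-bind", "/lib", "/lib",
		"--ro-bind", compartmentDir, "/work", // only this code is visible
		"--tmpfs", "/tmp",
		"--proc", "/proc",
		"--dev", "/dev",
		"--unshare-all",     // fresh namespaces, including network
		"--die-with-parent", // don't outlive the agent process
		"--chdir", "/work",
		"/bin/sh", "-c", command,
	)
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	return cmd.Run()
}

func main() {
	// Placeholder compartment path and command.
	if err := runSandboxed("/compartments/auth", "ls -la"); err != nil {
		log.Fatalf("sandboxed run failed: %v", err)
	}
}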
Spike 2: Real-time code redaction
- Can you parse arbitrary code and replace sensitive identifiers?
- How do you handle different languages?
- What about string literals, comments, error messages?
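A crude sketch for Spike 2: regex-based pseudonymization of identifiers with a stable mapping, so the same name always becomes the same placeholder across files. The sensitive-name list is a stand-in; real redaction would need per-language parsing to handle string literals and comments properly.

package main

import (
	"fmt"
	"regexp"
)

// Redactor replaces sensitive identifiers with stable pseudonyms so that
// redacted code stays internally consistent across files and turns.
type Redactor struct {
	pattern *regexp.Regexp
	mapping map[string]string
}

func NewRedactor(sensitive []string) *Redactor {
	// \b...\b keeps us from rewriting substrings inside longer identifiers.
	alt := ""
	for i, s := range sensitive {
		if i > 0 {
			alt += "|"
		}
		alt += regexp.QuoteMeta(s)
	}
	return &Redactor{
		pattern: regexp.MustCompile(`\b(` + alt + `)\b`),
		mapping: map[string]string{},
	}
}

func (r *Redactor) Redact(src string) string {
	return r.pattern.ReplaceAllStringFunc(src, func(name string) string {
		if _, ok := r.mapping[name]; !ok {
			r.mapping[name] = fmt.Sprintf("ident_%03d", len(r.mapping)+1)
		}
		return r.mapping[name]
	})
}

func main() {
	// Placeholder list of identifiers considered sensitive for this spike.
	r := NewRedactor([]string{"BillingSecretKey", "chargeCustomer"})
	fmt.Println(r.Redact(`key := BillingSecretKey; chargeCustomer(key) // chargeCustomer`))
}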
Spike 3: Multi-agent coordination
- How do you pass sanitized summaries between agents?
- How do you maintain context isolation?
- What's the orchestration model?
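A skeletal sketch for Spike 3: a coordinator that gives each agent a private context and only lets sanitized summaries cross compartment boundaries. All types and the sanitize stub are assumptions.

package main

import "fmt"

// Summary is the only artifact allowed to cross compartment boundaries.
type Summary struct {
	Compartment string
	Text        string // already sanitized
}

// Agent holds a private context; nothing in it is shared directly.
type Agent struct {
	Compartment string
	context     []string // private message history
}

func (a *Agent) Work(task string, inputs []Summary) Summary {
	// Only sanitized summaries from other compartments enter this context.
	for _, s := range inputs {
		a.context = append(a.context, "summary["+s.Compartment+"]: "+s.Text)
	}
	a.context = append(a.context, "task: "+task)
	// Placeholder for an LLM call over a.context.
	result := fmt.Sprintf("findings for %s (%d context items)", a.Compartment, len(a.context))
	return Summary{Compartment: a.Compartment, Text: sanitize(result)}
}

// sanitize stands in for the redaction engine.
func sanitize(s string) string { return s }

func main() {
	auth := &Agent{Compartment: "service-auth"}
	billing := &Agent{Compartment: "service-billing"}

	// The coordinator passes only Summary values between agents, never
	// raw context, files, or message histories.
	s1 := auth.Work("map the login flow", nil)
	s2 := billing.Work("check how auth tokens are consumed", []Summary{s1})
	fmt.Println(s2.Text)
}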
Spike 4: Tool restriction
- How do you define "allowed tools per role"?
- How do you validate tool calls before execution?
- Can you restrict tool parameters (e.g., allowed paths)?
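A sketch for Spike 4: validating a tool call against a role allowlist and path restrictions before execution. The rule shape loosely mirrors the tool protocol spec in the planning deliverables below; names are assumptions.

package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Rule describes what one role may do with one tool.
type Rule struct {
	Tool         string
	Roles        []string
	AllowedPaths []string // path prefixes the tool may touch
}

func contains(xs []string, x string) bool {
	for _, v := range xs {
		if v == x {
			return true
		}
	}
	return false
}

// Validate rejects a tool call unless the role is allowed and the path
// argument stays inside one of the allowed prefixes.
func Validate(rules []Rule, role, tool, path string) error {
	for _, r := range rules {
		if r.Tool != tool || !contains(r.Roles, role) {
			continue
		}
		clean := filepath.Clean(path)
		for _, prefix := range r.AllowedPaths {
			if strings.HasPrefix(clean, filepath.Clean(prefix)+string(filepath.Separator)) {
				return nil
			}
		}
		return fmt.Errorf("path %q outside allowed roots for %s", path, tool)
	}
	return fmt.Errorf("tool %q not allowed for role %q", tool, role)
}

func main() {
	rules := []Rule{{Tool: "read_file", Roles: []string{"dev", "reviewer"}, AllowedPaths: []string{"/compartments/auth"}}}

	fmt.Println(Validate(rules, "dev", "read_file", "/compartments/auth/login.go"))     // allowed
	fmt.Println(Validate(rules, "dev", "read_file", "/compartments/auth/../billing/x")) // rejected
}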
Planning Deliverables (Before Coding)
Create these documents:
1. System Architecture Doc
- Component diagram with responsibilities
- API boundaries between components
- Data flow for common operations
- Technology choices with justification
2. Security Model Doc
- Threat model (what are you protecting against?)
- Trust boundaries
- Isolation mechanisms per boundary
- Redaction rules & bypass scenarios
3. Tool Protocol Spec
{
"tool": "read_file",
"compartment": "service-auth",
"allowed_paths": ["/compartments/auth/**"],
"redaction": "sanitize_identifiers",
"role_requirements": ["dev", "reviewer"]
}
4. Compartmentalization Rules (see the struct sketch after this list)
- How do you decide compartment boundaries?
- What's in a compartment summary vs. raw access?
- How do roles map to compartments?
- Escalation paths (when agent needs broader context)
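Items 3 and 4 could be captured as configuration types early on; a rough Go shape, with field names taken from the JSON above and everything else assumed.

package config

// ToolSpec mirrors the tool protocol example above.
type ToolSpec struct {
	Tool             string   `json:"tool"`
	Compartment      string   `json:"compartment"`
	AllowedPaths     []string `json:"allowed_paths"`
	Redaction        string   `json:"redaction"` // e.g. "sanitize_identifiers"
	RoleRequirements []string `json:"role_requirements"`
}

// Compartment describes one unit of isolation and what each role gets.
type Compartment struct {
	Name        string   `json:"name"`
	Roots       []string `json:"roots"`        // paths owned by this compartment
	SummaryOnly []string `json:"summary_only"` // roles limited to summaries
	RawAccess   []string `json:"raw_access"`   // roles allowed raw (sandboxed) reads
	Escalation  string   `json:"escalation"`   // e.g. "request_parent_summary"
}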
5. MVP Feature List
Prioritize ruthlessly. First version might be:
- ✅ Single agent with file read restriction
- ✅ Basic identifier redaction for one language
- ✅ Simple tool allowlist
- ❌ Multi-agent hierarchy (later)
- ❌ Full role system (later)
- ❌ Cross-language redaction (later)
Recommended Starting Point
After research & planning, start with this minimal but complete foundation:
claude-compartment/
├── cmd/
│   └── claude-c/       # CLI entry point
├── internal/
│   ├── engine/         # Agent orchestration (based on Fantasy)
│   ├── sandbox/        # Filesystem/shell isolation
│   ├── redactor/       # Code sanitization
│   ├── tools/          # Restricted tool implementations
│   ├── policy/         # Access control rules
│   └── config/         # Role/compartment configuration
├── compartments/       # Actual compartmented code
└── .claude-c/          # Summaries cache
Phase 1 implementation (tool and policy sketch after the list):
- Basic CLI + Fantasy SDK integration (chat with Claude)
- Filesystem sandbox (read-only, single compartment)
- Simple redactor (regex-based identifier replacement)
- One custom tool (e.g., read_compartment_file)
- Hardcoded policy (one role, one compartment)
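A possible Phase 1 shape for the custom tool plus the hardcoded policy, tying the sandbox, redactor, and policy pieces together; all names and paths here are illustrative.

package tools

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// Phase 1: one role, one compartment, hardcoded.
const (
	phase1Role        = "dev"
	phase1Compartment = "/compartments/auth"
)

// ReadCompartmentFile is the single Phase 1 tool: a file read restricted to
// the compartment root, with the result passed through the redactor.
func ReadCompartmentFile(role, relPath string, redact func(string) string) (string, error) {
	if role != phase1Role {
		return "", fmt.Errorf("role %q is not allowed to read files", role)
	}
	full := filepath.Clean(filepath.Join(phase1Compartment, relPath))
	if !strings.HasPrefix(full, phase1Compartment+string(filepath.Separator)) {
		return "", fmt.Errorf("path %q escapes the compartment", relPath)
	}
	data, err := os.ReadFile(full)
	if err != nil {
		return "", err
	}
	return redact(string(data)), nil
}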
Then iterate:
- Phase 2: Multi-compartment, summary generation
- Phase 3: Multi-agent hierarchy
- Phase 4: Full role system, tool surfaces
- Phase 5: Production hardening
Critical Questions to Answer in Planning
- Primary language target? (Go, Python, Rust?) - affects redaction complexity
- Deployment model? (local CLI, server, both?)
- Summary storage? (cached, persistent DB, regenerated?)
- Agent model? (single LLM with routing, or actual multiple agent instances?)
- Redaction scope? (identifiers only, or also logic/algorithms/comments?)
- Performance requirements? (can summarization be slow, or needs to be real-time?)