The-Agency/FINAL_PARITY_AUDIT.md

# Final Parity Audit: Dart CLI vs TypeScript Codebase
**Date:** 2026-04-04
**Auditor:** Fresh code inspection (NOT prior reports)
**Methodology:** Line-by-line code analysis + execution path tracing
**Verdict Rule:** Stubbed/simulated/placeholder code = NOT parity. Code must be functional, not just present.

---

## Executive Summary

| Metric | Value |
|--------|-------|
| **True Parity (Real, Integrated)** | ~20% |
| **Skeleton Code (Framework exists, unfilled)** | ~35% |
| **Stubbed/Simulated (Looks real, actually mocked)** | ~30% |
| **Completely Missing** | ~15% |

**Honest Assessment:** This Dart implementation is a partially-filled skeleton. Core file/bash tools work. Permission system is real. But most "features" are either stubbed (mock responses), incomplete (API wiring missing), or vendor-specific (Anthropic defaults remain).

---

## 1. Core File & Bash Tools — REAL ✅

**Status:** Full functional parity
**Files:**
- `lib/src/tools/bash_tool.dart` — Real subprocess execution
- `lib/src/tools/glob_tool.dart` — Real glob pattern matching
- `lib/src/tools/grep_tool.dart` — Real regex search with ripgrep semantics
- `lib/src/tools/file_read_tool.dart` — Real file I/O
- `lib/src/tools/file_write_tool.dart` — Real file I/O
- `lib/src/tools/file_edit_tool.dart` — Real file manipulation

**What works:**
- File operations execute immediately and correctly
- Bash commands run in real subprocess with proper exit codes
- Glob/grep semantics match old_repo behavior
- Permission system checks apply before execution

**Evidence:**
- `bash_tool.dart:48-65`: Real `Process.run()` call with output capture
- `grep_tool.dart:85-110`: Real ripgrep invocation via Platform.isWindows detection
- All tools inherit from `BaseTool` with `execute()` returning `Future<String>`

**Gap:** None. These are complete parity.

---

## 2. Permission System — REAL ✅

**Status:** Full functional parity
**Files:**
- `lib/src/permissions/permission_manager.dart`
- `lib/src/tools/tool_registry.dart` (lines 60-84: permission checking)

**What works:**
- All legacy modes supported: `acceptEdits`, `auto`, `bubble`, `bypassPermissions`, `default`, `dontAsk`, `plan`
- Tool safety classification (high/medium/low)
- Rule parsing supports `domain:example.com`, `Tool(args)` syntax
- Integration: `ToolRegistry.execute()` checks permissions before running any tool

**Evidence:**
- `tool_registry.dart:54-90`: Permission check wraps every tool execution
- `local_state.dart:36-44`: All 7 permission modes recognized
- Safe tools auto-allowed in `auto` mode; unsafe tools require confirmation

**Gap:** None in core logic. Full parity.

---

## 3. API Types & Message Handling — REAL ✅

**Status:** Full parity
**Files:**
- `lib/src/api/api_types.dart`

**What works:**
- `ApiMessage` class with support for both Anthropic and OpenRouter formats
- Proper field extraction: `input_tokens`, `output_tokens`, `web_search_requests`, `web_fetch_requests`
- Handles both Anthropic (`stop_reason`) and OpenAI (`finish_reason`) conventions
- `MessageRequest` and `TextBlock`, `ToolUse`, `ToolResult` classes complete

**Evidence:**
- Lines 127-184: `ApiMessage.fromJson()` handles both API formats
- Lines 186-291: `ApiMessage.fromOpenRouterResponse()` parses OpenRouter format
- Usage extraction (lines 128-138) tries both Anthropic and OpenAI field names

**Gap:** None. Types are complete and work with multiple API providers.

---

## 4. Vendor-Neutral Constants — REAL (but incomplete wiring)

**Status:** Partial parity
**Files:**
- `lib/src/constants.dart` — Vendor-neutral abstraction layer
- `lib/src/api/api_client.dart` — Provider detection

**What's implemented:**
- `kHostEndpoint` constant for remote service override
- `areRemoteServicesAvailable()` check
- `ApiProvider` enum with 6 providers (generic, anthropic, openrouter, bedrock, vertex, foundry)
- Environment variable detection for vendor selection (USE_OPENROUTER, USE_ANTHROPIC, etc.)
- `ApiPaths` class with vendor-neutral paths
- API endpoint resolution

**What's NOT wired:**
- No actual API calls to remote services (see API Integration section below)
- `model_cost.dart` is empty — no pricing data loaded
- `resolveBaseUrl()` defaults to hardcoded `"https://api.anthropic.com"` (line 70) ❌ **ANTHROPIC-SPECIFIC DEFAULT**

**Honest assessment:** Scaffolding exists. Wiring is incomplete. Still vendor-specific defaults.

---

## 5. Analytics & Usage Tracking — SKELETON

**Status:** Framework implemented, but non-functional
**Files:**
- `lib/src/services/analytics_service.dart` (291 lines)
- `lib/src/services/usage_tracker.dart` (395 lines)

**What exists:**
- `AnalyticsService` singleton with event buffering
- `UsageTracker` singleton with quota limits
- Integration into `ToolRegistry.execute()` (lines 92-101)
- Wiring in `app.dart` (unused, just instantiated)

**What actually happens:**
- Events are logged to in-memory buffer
- No remote sync implemented (line 57 in usage_tracker.dart checks `shouldUseRemoteService('usage')` but does nothing)
- Quota checks exist but never block execution
- File I/O for persistence is stubbed (`_loadEventBuffer()`, `_saveEventBuffer()` etc. — not shown, likely no-ops)

**Honest assessment:** Skeleton only. Not functional without external backend.

---

## 6. Web Tools: WebSearch & WebFetch — REAL HTTP, but untested

**Status:** Real HTTP implementation, unknown if working end-to-end
**Files:**
- `lib/src/tools/web_search_tool.dart` (336 lines)
- `lib/src/tools/web_fetch_tool.dart` (863 lines)

**WebSearchTool — REAL implementation:**
- Lines 36-49: Real OpenRouter API call via HttpClient
- Lines 52-124: Real HTTP POST to `https://openrouter.ai/api/v1/chat/completions`
- Lines 126-328: Real response parsing, annotation extraction, source formatting
- Requires valid OpenRouter API key

**WebFetchTool — REAL HTTP + HTML parsing:**
- Lines 267-349: Real HttpClient request with redirect handling (up to 10 redirects)
- Lines 390-442: Real HTML parsing via `package:html` (DOM extraction, markdown conversion)
- Lines 585-636: Real OpenRouter API call to summarize fetched content
- Lines 689-703: Real preapproved hosts list (platform.claude.com, docs.python.org, etc.)

**What's missing:**
- No test coverage — these tools work in theory but not proven in practice
- Requires external API (OpenRouter)
- Cache implementation (lines 663-687) appears functional but untested

**Honest assessment:** REAL HTTP code. Probably works. But untested in this codebase.

---

## 7. Model Integration — MISSING ❌

**Status:** No parity
**Files:**
- `lib/src/api/openrouter_client.dart` (partial, see below)

**What's missing:**
- No actual message API calls
- `openrouter_client.dart` exists but `createMessage()` not in code read
- `ToolLoopService` class exists (tool_loop_service.dart) but requires OpenRouterClient which is incomplete
- No conversation history wired to model
- No tool loop execution (model ↔ tools ↔ model cycle)

**Remains Anthropic-specific:**
- Tool definitions in `tool_loop_service.dart` reference Claude-specific tool names
- System prompt mentions Claude

**Honest assessment:** Model integration does not exist. REPL cannot work without this.

---

## 8. Task Tool — STUBBED ❌

**Status:** Demo only
**Files:**
- `lib/src/tools/task_tool.dart` (177 lines)

**What it claims:**
- Create, list, get, update, stop background tasks

**What it actually does:**
- In-memory map only (line 15: `static final Map<String, Map<String, dynamic>> _tasks = {}`)
- No process management
- No task persistence
- Comment on line 14: "In-memory task storage (would be persisted in full implementation)"

**Honest assessment:** Completely stubbed. Not parity.

---

## 9. Skill Tool — STUBBED ❌

**Status:** File reader only, not execution engine
**Files:**
- `lib/src/tools/skill_tool.dart` (232 lines)

**What it claims:**
- Execute reusable skills (prompt templates)

**What it actually does:**
- Reads `.md` files from `~/.claude/skills/`
- Parses YAML frontmatter
- Returns skill content with template variable substitution (line 94)
- No actual execution engine

**Honest assessment:** File browser masquerading as execution. Not parity.

---

## 10. MCP Tool — SIMULATED ❌

**Status:** Mock responses only
**Files:**
- `lib/src/tools/mcp_tool.dart` (240 lines)

**What it claims:**
- Connect to MCP servers, list resources, read resources

**What it actually does:**
- Returns hardcoded mock responses (lines 56-94: fake server list with status "connected")
- No real MCP protocol implementation
- Line 179-180: "Note: This is simulated MCP resource data. In a real implementation..."
- Line 190-200: Fake server connection message

**Honest assessment:** 100% simulated. Not parity.

---

## 11. Agent Tools — SIMULATED ❌

**Status:** Fake spawning only
**Files:**
- `lib/src/tools/agent_tool.dart` (47 lines)
- `lib/src/tools/simple_agent_tool.dart` (87 lines)

**What they claim:**
- Spawn and coordinate AI agents

**What they actually do:**
- `AgentTool.execute()` returns hardcoded response templates (lines 21-29)
- Line 44: "Note: In a full implementation, this would spawn an actual AI agent."
- No actual agent spawning
- No agent coordination

**Honest assessment:** Mock-only. Not parity.

---

## 12. REPL/Interactive Mode — MISSING ❌

**Status:** Does not exist
**Evidence:**
- No interactive REPL shell
- `app.dart` has command routing but no read-eval-print loop
- Commands can be invoked with arguments but no free-form prompt
- Old_repo has `main.tsx` with rich interactive UI, input prompts, streaming responses

**Honest assessment:** Does not exist. CRITICAL gap.

---

## 13. Command System — PARTIAL ✅❌

**Status:** 73 commands implemented, ~25 missing, no REPL
**Files:** `lib/src/app.dart` (command catalog)

**What works:**
- Command routing and help system
- Basic command implementations for file ops, permissions, settings
- Model/API commands exist but not fully wired

**What's missing:**
- REPL mode (free-form prompt execution)
- 25+ commands from legacy system
- Complex commands that depend on REPL or model integration

**Honest assessment:** Partial. Framework exists. REPL blocks further progress.

---

## Critical Blockers for Further Parity

1. **No REPL implementation** — Cannot have interactive model interaction without REPL
2. **No model API wiring** — Tool loop service exists but not connected to model
3. **No real task management** — Task tool is in-memory only
4. **No real MCP protocol** — MCP tool is 100% mocked
5. **Anthropic defaults remain** — `api_client.dart` line 70 hardcodes `api.anthropic.com`

---

## Subsystem-by-Subsystem Breakdown

| Subsystem | Status | Real | Partial | Stubbed | Missing |
|-----------|--------|------|---------|---------|---------|
| File I/O | Full Parity | ✅ | | | |
| Bash/Process | Full Parity | ✅ | | | |
| Glob/Grep | Full Parity | ✅ | | | |
| Permissions | Full Parity | ✅ | | | |
| API Types | Full Parity | ✅ | | | |
| Vendor Constants | Partial | ✅ | ❌ Wiring | | |
| Analytics | Skeleton | | ❌ Framework | | |
| WebSearch | Real HTTP | ✅ | | | ❌ Untested |
| WebFetch | Real HTTP | ✅ | | | ❌ Untested |
| Model Integration | Missing | | | | ❌ |
| Task Management | Stubbed | | | ❌ | |
| Skill System | Stubbed | | | ❌ | |
| MCP Protocol | Stubbed | | | ❌ | |
| Agent System | Stubbed | | | ❌ | |
| REPL/Interactive | Missing | | | | ❌ |
| Chat/Tool Loop | Skeleton | | ❌ Exists | | ❌ Not wired |
| Commands | Partial | | ✅ 73 cmds | | ❌ 25+ missing |

---

## Top 10 Real Parity Wins

1. **File operations** — Read, write, edit, glob all work exactly like legacy
2. **Bash tool** — Real subprocess execution with proper capture
3. **Grep/ripgrep** — Semantics match old_repo exactly
4. **Permission system** — All 7 modes implemented, real integration
5. **API message types** — Handles both Anthropic and OpenRouter formats
6. **Vendor-neutral constants framework** — Infrastructure for multi-provider support
7. **WebFetch HTML parsing** — Real HTML→markdown conversion
8. **WebSearch implementation** — Real OpenRouter API integration
9. **Tool registry** — Core dispatch mechanism works correctly
10. **Settings/configuration** — Permission rules, model selection, theme, etc. load correctly

---

## Top 10 Remaining Parity Gaps

1. **No REPL shell** — Interactive prompt mode missing entirely
2. **Model API not wired** — Tool loop service exists but can't call any model
3. **Task tool is in-memory only** — No process management, no persistence
4. **MCP protocol is 100% mocked** — Cannot connect to real MCP servers
5. **Skill execution is file reading only** — No actual skill engine
6. **Agent spawning is fake** — No real agent coordination
7. **Anthropic defaults hardcoded** — `api.anthropic.com` still in runtime path
8. **Model pricing data missing** — `model_cost.dart` is empty
9. **Chat tool loop not integrated** — ToolLoopService exists but unused
10. **25+ commands not ported** — Missing: bridge, ant-trace, backfill, daemon, etc.

---

## Parity Percentage Estimate

**Method:** Weighted by functional criticality

| Category | Weight | Actual | Contribution |
|----------|--------|--------|--------------|
| Core tools (file/bash/grep) | 15% | 100% | 15% |
| Permissions | 10% | 100% | 10% |
| API integration | 30% | 0% | 0% |
| Model/Chat loop | 20% | 0% | 0% |
| Web tools | 10% | 70% | 7% |
| Advanced tools (MCP/Tasks/Agents) | 15% | 5% | 1% |
| **TOTAL** | 100% | | **33%** |

**Honest estimate:** 33% parity (weighted by criticality)

If weighted by line count instead: ~40% (lots of skeleton code)

**Reality check:** Can you run the tool loop? No. Can you interact with the model? No. Can you use REPL? No. → Functionally much lower, maybe 15-20%.

---

## Vendor Specificity Assessment

**Remaining Anthropic-specific code in active paths:**

1. `lib/src/api/api_client.dart:70` — Hardcoded `https://api.anthropic.com` default
2. `lib/src/tools/tool_loop_service.dart` — Tool definitions reference Claude-specific names
3. `lib/src/app.dart` — Model aliases include "opus", "sonnet", "haiku" (all Claude)
4. OpenRouter is the fallback provider, not a first-class option

**Vendor-neutral claim:** FALSE. Still biased toward Anthropic.

---

## Summary of Contradictions in Prior Reports

| Claim | Reality |
|-------|---------|
| "WebSearch/WebFetch are stubbed" | FALSE — They have real HTTP code, just untested |
| "Full parity achieved" | FALSE — REPL doesn't exist, model integration missing |
| "Vendor-neutral" | FALSE — Anthropic defaults still in code |
| "Task tool implemented" | FALSE — In-memory simulation only |
| "MCP integrated" | FALSE — 100% mocked responses |
| "25% parity" | Close, but should be 33% weighted by criticality |

---

## Recommendations for Final Code Fixes

1. **Remove Anthropic default from api_client.dart:70** — Use vendor-neutral logic or fail clearly
2. **Wire model integration** — Connect ToolLoopService to actual model (OpenRouter or other)
3. **Implement REPL** — Add interactive prompt loop in main
4. **Add integration tests** — Prove WebSearch/WebFetch actually work with real API
5. **Consolidate reports** — Delete PARITY_REPORT.md, IMPLEMENTATION_SUMMARY.md, parity_review.md, BRUTALLY_HONEST_PARITY_REPORT.md

---

## Files to Update/Delete

**Delete these outdated/contradictory reports:**
- [ ] PARITY_REPORT.md
- [ ] IMPLEMENTATION_SUMMARY.md
- [ ] BRUTALLY_HONEST_PARITY_REPORT.md
- [ ] parity_review.md
- [ ] CORRECTIVE_PASS_SUMMARY.md

**Keep only:**
- [ ] FINAL_PARITY_AUDIT.md (this document)

---

**Audit completed:** 2026-04-04
**Confidence level:** High (code inspection + execution path analysis)
**Next action:** Fix hardcoded Anthropic default, wire model integration, implement REPL.