---
description: Run evaluation against acceptance criteria
agent: build
---
# Eval Command

Evaluate implementation against acceptance criteria: $ARGUMENTS

## Your Task

Run structured evaluation to verify the implementation meets requirements.

## Evaluation Framework
### Grader Types

- **Binary Grader** - Pass/Fail
  - Does it work? Yes/No
  - Good for: feature completion, bug fixes

- **Scalar Grader** - Score 0-100
  - How well does it work?
  - Good for: performance, quality metrics

- **Rubric Grader** - Category scores
  - Multiple dimensions evaluated
  - Good for: comprehensive review
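As a rough TypeScript sketch, the three grader outputs can be modeled as a discriminated union. These types are hypothetical and illustrative, not shapes this repo defines:

```typescript
// Hypothetical result shapes for the three grader types (illustrative only).
type BinaryResult = { kind: "binary"; pass: boolean };   // Pass/Fail
type ScalarResult = { kind: "scalar"; score: number };   // 0-100
type RubricResult = {
  kind: "rubric";
  dimensions: Record<string, number>;                    // per-dimension scores
};

type GradeResult = BinaryResult | ScalarResult | RubricResult;
```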
### Evaluation Process

#### Step 1: Define Criteria

Acceptance Criteria:
1. [Criterion 1] - [weight]
2. [Criterion 2] - [weight]
3. [Criterion 3] - [weight]
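One way to carry these criteria into code is a small weighted-criterion type. This is an illustrative sketch with hypothetical names, not an interface the command defines:

```typescript
// Hypothetical shape for a weighted acceptance criterion.
interface Criterion {
  name: string;
  weight: number; // relative weight, e.g. 0.3 for 30%
  // Test runner returning a 0-10 score and a pointer to evidence.
  test: () => Promise<{ score: number; evidence: string }>;
}

const criteria: Criterion[] = [
  {
    name: "Login succeeds", // hypothetical criterion
    weight: 0.3,
    test: async () => ({ score: 10, evidence: "e2e run log" }),
  },
];
```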
#### Step 2: Run Tests

For each criterion:
- Execute relevant test
- Collect evidence
- Score result
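A minimal loop over the hypothetical `Criterion` shape from Step 1, keeping score, weight, and evidence together per criterion:

```typescript
// Result record for one criterion (hypothetical, matches the sketch above).
interface CriterionResult {
  name: string;
  weight: number;
  score: number;    // 0-10
  evidence: string;
}

// Execute each criterion's test and collect the results.
async function runTests(criteria: Criterion[]): Promise<CriterionResult[]> {
  const results: CriterionResult[] = [];
  for (const c of criteria) {
    const { score, evidence } = await c.test();
    results.push({ name: c.name, weight: c.weight, score, evidence });
  }
  return results;
}
```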
#### Step 3: Calculate Score

Final Score = Σ (criterion_score × weight) / total_weight
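The same formula in TypeScript, assuming 0-10 criterion scores scaled to a 0-100 final score (an assumption drawn from the report table below):

```typescript
// Final Score = Σ (criterion_score × weight) / total_weight, scaled to 0-100.
function finalScore(results: CriterionResult[]): number {
  const totalWeight = results.reduce((sum, r) => sum + r.weight, 0);
  const weighted = results.reduce((sum, r) => sum + r.score * r.weight, 0);
  return (weighted / totalWeight) * 10; // 0-10 criterion scale -> 0-100 overall
}
```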
#### Step 4: Report

**Evaluation Report**

Overall: [PASS/FAIL] (Score: X/100)

**Criterion Breakdown**

| Criterion | Score | Weight | Weighted |
|---|---|---|---|
| [Criterion 1] | X/10 | 30% | X |
| [Criterion 2] | X/10 | 40% | X |
| [Criterion 3] | X/10 | 30% | X |

**Evidence**

Criterion 1: [Name]
- Test: [what was tested]
- Result: [outcome]
- Evidence: [screenshot, log, output]

**Recommendations**

[If not passing, what needs to change]
## Pass@K Metrics

For non-deterministic evaluations:
- Run K times
- Calculate pass rate
- Report: "Pass@K = X/K"
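A small sketch of the Pass@K loop; `runOnce` is a hypothetical callback you supply for the non-deterministic check:

```typescript
// Run a non-deterministic check K times and report the pass rate.
async function passAtK(
  runOnce: () => Promise<boolean>,
  k: number,
): Promise<string> {
  let passes = 0;
  for (let i = 0; i < k; i++) {
    if (await runOnce()) passes++;
  }
  return `Pass@${k} = ${passes}/${k}`;
}
```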
**TIP:** Use eval for acceptance testing before marking features complete.