Reliable Engineering

High-assurance workflow for critical features using multi-agent competitive generation, independent evaluation, and evidence-based synthesis to produce superior solutions.

For simple features that don't require competitive exploration, use the Feature Development workflow instead.

When to Use

  • Quality-critical implementations - Authentication, payment processing, data validation

  • Novel or ambiguous requirements - No clear "right answer", multiple valid approaches

  • High-stakes architectural decisions - API design, schema design, core algorithms

  • Avoiding local optima - When single-agent reflection might miss better approaches

When NOT to Use

  • Simple, well-defined tasks with obvious solutions

  • Time-sensitive changes where speed matters more than exploration

  • Trivial bug fixes or typos

  • Tasks with only one viable approach

Plugins Needed
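
This workflow uses four plugins, invoked via the commands below:

  • sadd - /sadd:do-competitively, /sadd:tree-of-thoughts

  • tdd - /tdd:write-tests

  • code-review - /code-review:review-local-changes

  • git - /git:commit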

Workflow

How It Works

1. Competitive Implementation

Use /sadd:do-competitively to generate multiple solutions, evaluate them independently, and synthesize the best elements.
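
A minimal invocation passes the task as a free-form prompt (the task wording below is illustrative, not a required syntax):

```
# Illustrative prompt — describe your own feature here
/sadd:do-competitively implement token-bucket rate limiting for the public API gateway
```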

What happens:

  1. 3 agents independently design and implement solutions with self-critique

  2. 3 judges evaluate each solution using structured rubrics with verification

  3. An adaptive strategy then selects a path: POLISH the winner, REDESIGN if all solutions are flawed, or SYNTHESIZE the best elements of several

For a specific output location (the path below is a placeholder):
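
```
/sadd:do-competitively implement token-bucket rate limiting, writing the final solution to src/middleware/rate_limit.ts
```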

With custom evaluation criteria (choose criteria that match your priorities):
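
```
/sadd:do-competitively implement token-bucket rate limiting, judging on correctness under burst traffic, memory footprint, and ease of configuration
```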

After completion, review the synthesized solution to ensure it meets your requirements.

2. Write Tests

Use /tdd:write-tests to generate comprehensive test coverage for the synthesized solution.
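
For example (again, the argument is a free-form description; the file path here is a placeholder):

```
/tdd:write-tests cover the synthesized rate limiter in src/middleware/rate_limit.ts
```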

Or with a specific focus (the focus areas below are just examples):
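
```
/tdd:write-tests focus on edge cases: clock skew, concurrent refills, and zero-capacity buckets
```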

Verify all tests pass before continuing.

3. Review Local Changes

Use /code-review:review-local-changes for final multi-agent validation.
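
Since the command reviews whatever is uncommitted, a bare invocation is usually sufficient:

```
/code-review:review-local-changes
```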

Address Critical and High priority findings before committing.

4. Create Commit

Use /git:commit to create a well-formatted conventional commit.
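
A bare invocation is typically all that's needed:

```
/git:commit
```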

Quality Comparison

| Aspect | Feature Development | Reliable Engineering |
| --- | --- | --- |
| Agents | 1 (with self-reflection) | 3 generators + 3 judges |
| Exploration | Single path | Multiple competing approaches |
| Issue Detection | 40-60% (self-critique) | 70-85% (competitive + judges) |
| Cost | Lower | 4-6x higher |
| Time | Faster | Slower |
| Best For | Simple, clear tasks | Critical, ambiguous tasks |

Advanced: Combining with Tree of Thoughts

For tasks requiring exploration before commitment, run /sadd:tree-of-thoughts first. The prompts below are illustrative:
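
```
# Step 1: explore and compare candidate approaches before committing to one
/sadd:tree-of-thoughts evaluate caching strategies for the session store: write-through, write-behind, refresh-ahead

# Step 2: competitively implement the approach that won the exploration
/sadd:do-competitively implement the write-through session cache selected above
```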

Advanced: Debate-Based Evaluation

For the highest-stakes decisions where judge consensus is critical, ask for debate-style evaluation. The invocation below is a sketch that assumes the judging behavior can be steered through the prompt; no dedicated debate command or flag is documented here:
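
```
# Sketch only — debate behavior is requested via the prompt, not a documented flag
/sadd:do-competitively design the payment retry and idempotency policy, and have the judges debate each other's evaluations to consensus before a strategy is selected
```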

Tips

  • Reserve for critical work - The 4-6x cost overhead is only justified for high-stakes implementations

  • Specify criteria - Custom evaluation criteria improve judge alignment with your priorities

  • Review synthesis - Always validate the final synthesized solution makes coherent sense

  • Iterate if needed - If the REDESIGN strategy triggers, provide more context on the second attempt

  • Use for learning - Competitive execution reveals trade-offs between approaches

Theoretical Foundation

This workflow combines research-backed techniques:

| Technique | Source | Benefit |
| --- | --- | --- |
| Constitutional AI Self-Critique | Bai et al., 2022 | 40-60% issue reduction before review |
| Chain of Verification | Dhuliawala et al., 2023 | Reduces judge bias |
| Multi-Agent Debate | Du et al., 2023 | Diverse perspectives improve reasoning |
| Self-Consistency | Wang et al., 2022 | Multiple paths improve reliability |
