First Principles Framework
Structured reasoning plugin that makes AI decision-making transparent and auditable through hypothesis generation, logical verification, and evidence-based validation.
Focused on:
Transparent reasoning - All decisions documented with full audit trails
Hypothesis-driven analysis - Generate competing alternatives before evaluating
Evidence-based validation - Computed reliability scores, not estimates
Human-in-the-loop - AI generates options; humans decide (Transformer Mandate)
Plugin Goals
Make AI reasoning auditable - full trail from hypothesis to decision
Prevent premature conclusions - enforce systematic evaluation of alternatives
Build project knowledge over time - decisions become reusable knowledge
Enable informed decision-making - trust scores based on evidence quality
Overview
The FPF plugin implements structured reasoning using the First Principles Framework (FPF), a methodology for rigorous, auditable reasoning developed by Anatoly Levenchuk. The killer feature is turning the black box of AI reasoning into a transparent, evidence-backed audit trail.
The core cycle follows three modes of inference:
Abduction — Generate competing hypotheses (don't anchor on the first idea).
Deduction — Verify logic and constraints (does the idea make sense?).
Induction — Gather evidence through tests or research (does the idea work in reality?).
Then, audit for bias, decide, and document the rationale in a durable record.
The framework addresses a fundamental challenge in AI-assisted development: making decision-making processes transparent and auditable. Rather than having AI jump to solutions, FPF enforces generating competing hypotheses, checking them logically, testing against evidence, then letting developers choose the path forward.
Warning: This plugin loads the core FPF specification into context, and that specification is large (~600k tokens). It is therefore loaded into a subagent running the Sonnet[1m] model, but such an agent can still consume your token limit quickly.
Implementation based on quint-code by m0n0x41d.
Quick Start
Workflow Diagram
Commands Overview
/fpf:propose-hypotheses - Decision Cycle
Execute the complete FPF cycle from hypothesis generation through evidence validation to decision.
Purpose - Make architectural decisions with full audit trail
Output - .fpf/decisions/DRR-<date>-<topic>.md with winner and rationale
Arguments
Natural language description of the decision or problem. Examples: "What caching strategy should we use?" or "How should we deploy our application?"
How It Works - ADI Cycle
The workflow follows three inference modes:
Initialize Context - Creates the .fpf/ directory structure and captures problem constraints
Abduction: Generate Hypotheses - The FPF agent generates 3-5 plausible, diverse, and competing hypotheses in the L0 folder. How it works:
You pose a problem or question
The AI (as Abductor persona) generates 3-5 candidate explanations or solutions
Each hypothesis is stored in L0/ (unverified observations)
No hypothesis is privileged — anchoring bias is the enemy
Output: Multiple L0 claims, each with:
Clear statement of the hypothesis
Initial reasoning for plausibility
Identified assumptions and constraints
User Input - Presents hypothesis table, allows user to add alternatives
Deduction: Verify Logic (Parallel) - Checks each hypothesis against constraints and typing, promoting it to L1 or invalidating it. How it works:
The AI (as Verifier persona) checks each L0 hypothesis for:
Internal logical consistency
Compatibility with known constraints
Type correctness (does the solution fit the problem shape?)
Hypotheses that pass are promoted to L1/
Hypotheses that fail are moved to invalid/ with an explanation
Output: L1 claims (logically sound) or invalidation records.
Induction: Validate Evidence (Parallel) - Gathers empirical evidence through tests or research and promotes L1 hypotheses to L2. How it works:
For internal claims: run tests, measure performance, verify behavior
For external claims: research documentation, benchmarks, case studies
Evidence is attached with:
Source and date (for decay tracking)
Congruence rating (how well does external evidence match our context?)
Claims that pass validation are promoted to L2/
Output: L2 claims (empirically verified) with evidence chain.
Audit (Parallel) - Compute the trust score R_eff using:
WLNK (Weakest Link): Assurance = min(evidence levels)
Congruence Check: Is external evidence applicable to our context?
Bias Detection: Are we anchoring on early hypotheses?
Make Decision: Presents a comparison table, selects the winner, and creates a Design Rationale Record (DRR) that captures:
decision
alternatives considered
evidence
expiry conditions
Present Summary: Shows DRR, winner rationale, and next steps
Usage Examples
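For instance, a decision framed as a question (the topics below are illustrative):

```
/fpf:propose-hypotheses "What caching strategy should we use?"
/fpf:propose-hypotheses "How should we deploy our application?"
```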
When to Use
Use it for:
Architectural decisions with long-term consequences
Multiple viable approaches requiring systematic evaluation
Decisions that need an auditable reasoning trail
Building up project knowledge over time
Skip it for:
Quick fixes with obvious solutions
Easily reversible decisions
Time-critical situations where the overhead isn't justified
Best practices
Frame as decisions - "What X should we use?" or "How should we Y?"
Be specific about constraints - Include performance, cost, or time requirements
Add your own hypotheses - Don't rely only on AI-generated options
Review verification failures - Failed hypotheses reveal hidden constraints
Document for future reference - DRRs become project knowledge
/fpf:status - Check Progress
Show current FPF phase, hypothesis counts, and any warnings about stale evidence.
Purpose - Understand current state of reasoning process
Output - Status table with phase, counts, and warnings
Arguments
None required.
How It Works
Phase Detection: Identifies current ADI cycle phase (IDLE, ABDUCTION, DEDUCTION, INDUCTION, DECISION)
Hypothesis Count: Reports counts per knowledge layer (L0, L1, L2, Invalid)
Evidence Status: Lists evidence files and their freshness
Warning Detection: Identifies stale evidence, orphaned hypotheses, or incomplete cycles
Usage Examples
Example Output:
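A hypothetical run might look like this (names and layout are illustrative; the actual format depends on the plugin version):

```
Phase:      INDUCTION
Hypotheses: L0: 0 | L1: 3 | L2: 1 | Invalid: 1
Evidence:   4 files, 1 stale
Warnings:   Stale evidence on H2; consider running /fpf:decay
```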
Best practices
Check before continuing - Know your current phase before proceeding
Address warnings - Stale evidence affects trust scores
Review invalid hypotheses - Understand why they failed
/fpf:query - Search Knowledge Base
Search the FPF knowledge base for hypotheses, evidence, or decisions with assurance information.
Purpose - Find and review stored knowledge with trust scores
Output - Search results with layer, R_eff, and evidence counts
Arguments
Keyword to search for, specific hypothesis name, or "DRR" to list decisions.
How It Works
Keyword Search: Searches hypothesis titles, descriptions, and evidence
Hypothesis Details: Returns full hypothesis info including layer, kind, scope, and R_eff
DRR Listing: Shows all Design Rationale Records with winner and rejected alternatives
Usage Examples
Example Output (keyword search):
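An illustrative result (hypothesis names are hypothetical):

```
Results for "caching":
  H1-redis-cache       L2       R_eff: 0.85   evidence: 3
  H2-in-memory-cache   L1       R_eff: 0.60   evidence: 1
  H3-cdn-cache         invalid
```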
Example Output (specific hypothesis):
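Again hypothetical, showing the fields described above:

```
H1-redis-cache
  Layer: L2 (Corroborated)
  Kind:  solution hypothesis
  Scope: API layer
  R_eff: 0.85 (3 evidence items)
```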
Example Output (DRR listing):
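A hypothetical listing:

```
DRR-2025-01-15-caching-strategy
  Winner:   H1-redis-cache
  Rejected: H2-in-memory-cache, H3-cdn-cache
```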
Best practices
Search before starting new decisions - Reuse existing knowledge
Check R_eff scores - Higher scores indicate more reliable hypotheses
Review DRRs - Past decisions inform future choices
/fpf:decay - Manage Evidence Freshness
Check for stale evidence and choose how to handle it: refresh, deprecate, or waive.
Purpose - Maintain evidence validity over time
Output - Updated evidence status and trust scores
Evidence expires. A benchmark from six months ago might not reflect current performance. /fpf:decay shows you what's stale and gives you three options:
Refresh — Re-run tests to get fresh evidence
Deprecate — Downgrade the hypothesis if the decision needs rethinking
Waive — Accept the risk temporarily with documented rationale
Arguments
None required. Command is interactive.
How It Works
Staleness Check: Identifies evidence files past their freshness threshold
Options Presented: For each piece of stale evidence:
Refresh: Re-run tests for fresh evidence
Deprecate: Downgrade hypothesis, flag decision for review
Waive: Accept risk temporarily with documented rationale
Trust Recalculation: Updates R_eff scores based on evidence changes
Usage Examples
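The command takes no arguments; a hypothetical interaction might look like:

```
/fpf:decay

Stale evidence found:
  E3 (benchmark for H1), 210 days old
  Choose: [r]efresh / [d]eprecate / [w]aive
```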
Best practices
Run periodically - Evidence expires; benchmarks from 6 months ago may not reflect current performance
Document waivers - Always include rationale and expiration date
Refresh critical evidence - High-impact decisions deserve fresh data
/fpf:actualize - Reconcile with Codebase
Update the knowledge base to reflect codebase changes that may affect existing hypotheses.
Purpose - Keep knowledge synchronized with implementation
Output - Updated hypothesis validity and evidence relevance
This command serves as the Observe phase of the FPF's Canonical Evolution Loop (B.4). It reconciles your documented knowledge with the current state of the codebase by:
Detecting Context Drift: Checks if project files (like package.json) have changed, potentially making your context.md stale.
Finding Stale Evidence: Finds evidence whose carrier_ref (the file it points to) has been modified in git.
Flagging Outdated Decisions: Identifies decisions whose underlying evidence chain has been impacted by recent code changes.
How It Works
Change Detection: Identifies code changes since last actualization
Impact Analysis: Determines which hypotheses and evidence are affected
Validity Update: Marks affected hypotheses for re-verification if needed
Report Generation: Summarizes changes and recommended actions
Usage Examples
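Typically run with no arguments after a refactor; the report below is illustrative:

```
/fpf:actualize

Context drift:    package.json changed since last actualization
Stale evidence:   E2 (carrier_ref src/cache.ts modified in git)
Flagged decision: DRR-2025-01-15-caching-strategy (re-verify winner)
```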
Best practices
Run after major changes - Refactoring may invalidate previous assumptions
Review impact report - Some hypotheses may need re-evaluation
Update evidence - Changed code may need new benchmarks
/fpf:reset - Start Fresh
Archive the current session and return to IDLE state for a new reasoning cycle.
Purpose - Clear current state while preserving history
Output - Archived session in .fpf/sessions/
Arguments
None required. Command is interactive.
How It Works
Reset Type Selection:
Soft Reset: Archive current session, start fresh (recommended)
Hard Reset: Delete all FPF data (cannot be undone)
Decision Reset: Keep hypotheses, re-evaluate from earlier phase
Session Archive: Creates a timestamped archive in .fpf/sessions/
State Clear: Clears knowledge directories based on reset type
Usage Examples
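The command is interactive; a hypothetical prompt:

```
/fpf:reset

Select reset type:
  1) Soft reset     (archive session, recommended)
  2) Hard reset     (delete all FPF data, cannot be undone)
  3) Decision reset (keep hypotheses, re-evaluate)
```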
When to Reset
| Situation | Recommended reset |
| --- | --- |
| Starting a new problem | Soft reset (archive) |
| Wrong direction, start over | Soft reset |
| Testing/learning FPF | Hard reset |
| Re-evaluate with new info | Decision reset |
| Context changed significantly | Soft reset + update context |
Best practices
Prefer soft reset - Always preserve history for reference
Hard reset only for testing - Production knowledge is valuable
Decision reset for pivots - When new information changes the equation
Available Agents
| Agent | Description | Used by |
| --- | --- | --- |
| fpf-agent | FPF reasoning specialist for hypothesis generation, verification, validation, and trust calculus using the ADI cycle | All commands |
fpf-agent
Purpose: Executes FPF reasoning tasks with file operations for persisting knowledge state.
Tools: Read, Write, Glob, Grep, Bash (mkdir, mv, touch)
Responsibilities:
Create hypothesis files in knowledge layers
Move files between L0/L1/L2/invalid directories
Create evidence files and audit reports
Generate Design Rationale Records (DRRs)
Key Concepts
| Concept | Meaning |
| --- | --- |
| ADI Cycle | Abduction-Deduction-Induction reasoning loop |
| Knowledge Layers | L0 (Conjecture) -> L1 (Substantiated) -> L2 (Corroborated) |
| WLNK | Weakest Link principle: R_eff = min(evidence_scores); see the sketch after the Congruence Levels table |
| Holon | Knowledge unit with identity, layer, kind, and assurance scores |
| DRR | Design Rationale Record documenting decisions |
| Transformer Mandate | AI generates options; humans decide |
Knowledge Layers (Epistemic Status)
| Layer | Status | Meaning | Produced by |
| --- | --- | --- | --- |
| L0 | Conjecture | Unverified hypothesis | Generate hypotheses |
| L1 | Substantiated | Passed logical check | Verify logic |
| L2 | Corroborated | Empirically validated | Validate evidence |
| Invalid | Falsified | Failed verification | FAIL verdict |
Congruence Levels
| Level | Evidence context | Trust penalty |
| --- | --- | --- |
| CL3 | Same (internal test) | None |
| CL2 | Similar (related project) | Minor |
| CL1 | Different (external docs) | Significant |
The Transformer Mandate
A core FPF principle: A system cannot transform itself.
AI generates options with evidence
Human decides
Making architectural choices autonomously is a PROTOCOL VIOLATION
This ensures accountability and prevents AI from making unsupervised decisions.
Directory Structure
The FPF plugin creates and manages this directory structure:
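The exact layout may vary by version; a representative tree, assembled from the paths referenced in this document, looks like:

```
.fpf/
├── context.md     # problem constraints
├── L0/            # unverified hypotheses (conjecture)
├── L1/            # logically verified (substantiated)
├── L2/            # empirically validated (corroborated)
├── invalid/       # falsified hypotheses with explanations
├── decisions/     # DRR-<date>-<topic>.md records
└── sessions/      # archived sessions from /fpf:reset
```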
When to Use FPF
Use it for:
Architectural decisions with long-term consequences
Multiple viable approaches requiring systematic evaluation
Decisions needing auditable reasoning trails
Building project knowledge over time
Skip it for:
Quick fixes with obvious solutions
Easily reversible decisions
Time-critical situations
Theoretical Foundation
Core Methodology
First Principles Framework (FPF) - Original methodology by Anatoly Levenchuk for structured epistemic reasoning
quint-code - Implementation this plugin is based on
Supporting Concepts
Abduction-Deduction-Induction Cycle - Classical scientific reasoning methodology
Weakest Link Principle - Trust computation based on minimum evidence quality
Design Rationale - Documenting not just decisions but the reasoning behind them