Rungs — Use Cases

1.3ms

Full Pipeline

100%

GSM8K Accuracy

99.15%

CLadder (vs GPT-4: 70.4%)

LLM / GPU Required

Benchmarks Passed

🔍

For Prospects

Start with the Alvaka briefing

or pick your industry from the Enterprise section below

For Engineers

Jump to Research & Benchmarks

99.15% CLadder · 42 benchmarks · full technical proof

For Investors

Start with Business Overview

then Strategy — Roadmap and Go-To-Market at the bottom

🏠 Start Here — What is Rungs?

Plain English business case, then the technical proof.

Rungs — Plain English Business Overview

Start Here

Your analytics tells you what happened. Rungs tells you why — and what to do. Three questions no other tool can answer: Why did this happen? What will happen if we change X? What would have happened if we'd done Y? No math. No jargon. Just the business case.

The LLM vs The Engine — Where Value Lives

Mode 1 vs Mode 2

Direct answer to "97% of the value is in the LLM." Side-by-side comparison of Claude alone vs Rungs alone vs both together on the same question. Visual benchmark proof. Why the engine is the IP and the LLM is the front door.

🎯 Live Prospect Demos

Built for specific accounts. These are the real pitches.

Alvaka Networks Briefing

Live Demo

Full interactive walkthrough built for Oli Thordarson's team. Kill chain reconstruction of a ransomware attack, causal control testing, ROI simulator, and executive risk dashboard — all driven by the Rungs causal engine. No LLM. No guesswork.

Alvaka × Rungs — Cybersecurity Use Cases

Use Cases

Full catalog of Rungs applications for Alvaka's security practice. Lateral movement tracing, threat attribution, control gap analysis, incident response automation, and compliance reporting — each mapped to Alvaka's existing service lines.

🏢 Enterprise Use Cases

Deterministic root cause analysis across security, compliance, and 12 industries.

🔐 Security & Incident Response

PwC: Ransomware IR

Security

LockBit 3.0 attack on financial services firm. 847 endpoints encrypted, $2.8M ransom. Full kill chain traced through 15 events. $0 in policy changes would have prevented $20M+ in damages.

🔍 Forensic & Compliance

PwC: Procurement Fraud

Forensic

$4.7M fictitious vendor scheme. Shell company, structured transactions below bid threshold, approval chain manipulation. Causal chain traces insider to shell entity in 12 events. $15K/year prevents $4.7M fraud (313x ROI).

PwC: SOX Control Failure

Audit

$38M revenue misstatement from cascading control failures in ASC 606 recognition. Material weakness traced through 10 events. $85K/year in controls prevents $305M total economic impact (3,588x ROI).

PwC: Response to Demo Feedback

Discussion

Point-by-point response to PwC's three objections after the initial demo. Covers: determinism vs LLM probabilism, detective vs preventative framing, and the integration path for existing SIEM/GRC tooling. Includes the reframed demo approach and next steps.

⚖️ Legal

Legal — Product Liability

Litigation

Defective hip implant. Suppressed metallurgy report establishes "but for" causation proof for product liability.

Criminal Case Analysis

Legal

State v. Howell drug case. 12 Rungs calls: proximate cause (3×Yes), modus ponens validity (3×Valid), ATE=+0.87, plea value=+0.30. Mode 2 live architecture.

State v. Howell — Full Report

Legal

Thorough causal analysis of CR100629. 5 scenarios: manufacturing→endangerment, school zone charge elevation, detention→plea, counterfactual sentencing (school zone added 10–15 months), modus ponens. Oregon sentencing grid visualization.

Legal Causal Benchmark — LLM vs Rungs

Benchmark

5 causal questions from CR100629 answered by GPT-4o, Claude 3.5 Sonnet, Gemini 1.5 Pro, and Rungs. LLMs hedge, hallucinate the sentencing grid, or conflate statutory elements. Rungs computes. Side-by-side output comparison with verdict on each question.

🏭 Industry Verticals

Healthcare

Clinical

Medication error causal chain leading to acute kidney injury. Lab alert + CPOE check ($40K/yr) prevents $2M adverse events.

Financial Intelligence

Finance

Insurance fraud detection via impossible causal chains. FAIR risk reduced from $4.2M to $800K through targeted controls.

Manufacturing

Industrial

Semiconductor defect root cause. Correlation said HVAC; causation found CMP misalignment. 1,470x ROI.

Drug Discovery

Pharma

CYP2C9 pathway drug interaction traced. $15K database check prevents dangerous Warfarin-Fluconazole bleeding events.

Climate & Energy

Energy

Texas grid failure. 80% gas infrastructure, not wind. $300M fix prevents $200B loss. Causal attribution overturns political narrative.

Northeast Grid Cascade Failure

Energy

345kV transmission cascade traced to a single Zone 2 relay misconfiguration 6 months prior — not the initial fault. NERC CIP compliance gap → N-1 contingency becomes N-2 blackout. Counterfactual: correct relay config prevents full cascade. $1.2B loss avoided.

Education Policy

Policy

Program evaluation. $800K breakfast program (8% budget) outperformed $4.2M software (42%). Causal isolation prevents $4.2M misallocation.

Investigations

Forensic

Follow-the-money corruption analysis. $350K donations traced to $134M in government contracts. 383x corruption ROI exposed.

Autonomous Vehicles

Automotive

Sensor fusion failure. Camera confidence 0.43 counted as vote instead of abstention. V2V intervention proves collision prevention.

🔬 Research & Benchmarks

99.15% CLadder · 42 benchmarks · Full technical proof.

📊 Benchmarks & Proofs

Math Foundations & Results

Research

12 mathematical fields · 20+ named formulas · 7 linguistic frameworks · 98.8% accuracy on 7,733 problems · beats GPT-4o, o3-mini, Claude Opus 4 on every dimension.

Why LLMs Cannot Reason

Research

3 controlled proof tests. A $0 regex parser outperforms GPT-4o by 18 points. DoVerifier (ETH Zürich, EACL 2026) compared. Why Claude's explanation of Rungs is still valid even though Claude can't reason.

Rungs vs. DoVerifier — Deep Report

Research

13-section technical comparison against ETH Zürich's EACL 2026 paper. Architecture, benchmarks, complexity, NL parsing gap, the 3 tests DoVerifier cannot pass, and what the best LLM scaffolding result actually proves.

Benchmark Report — 99.15%

Technical

Full technical report: how we improved from 98.6% to 99.15% with three-pass verification, complete CLadder breakdown by rung and query type, LLM cost tables, scaling cost matrix (1K→1M users), and independent verification instructions.

New Capabilities — March 2026

Engineering Report

Competitive landscape analysis of the entire Pearl do-calculus open-source ecosystem, three capability gaps identified, and the engineering report showing how they were built: large graph support (variable elimination), causal discovery (PC algorithm), and hidden confounders (backdoor/front-door/ID algorithm). 15/15 tests pass. Zero external dependencies.

🔬 Understanding Rungs

Why Causation Matters

Executive

For leaders and strategists. The 30-year unsolved problem in business intelligence. Why every major decision is a causation question — and why every current AI tool answers it wrong. No math required.

Rungs Explained

Plain English

How Rungs works, what the 99.15% score actually means, and what it costs to deploy at scale — explained without jargon. The 3 questions standard AI cannot answer. Real examples across 4 industries.

Build Log & Edge Roadmap — What's New + What's Coming

★ New

Mark's build log: Causal Red Team just shipped (5-attack adversarial stress-test, grade A/B/C/D), full list of what's in the engine, 2 features in progress, and 8 edge roadmap features that aren't in any open-source tool today — Transportability, Fairness Auditing, Network Interference, Federated Causal Inference, and more.

Rungs Engine — Full Technical Capabilities

Review Package

Complete technical review package for Brian and Scott. 80-assertion test suite with all DGPs embedded. Every claim is independently verifiable: grep the imports, run the tests, check the architecture. Includes Causal Red Team — a 5-attack adversarial stress-test that grades any ATE A/B/C/D. No trust required. Clone the repo and run python3 test_all.py.

Architecture Brief — Standalone vs. Hybrid

Internal Brief

Addresses whether Rungs depends on Claude, what minimum LLM the NLP layer needs, and whether building a custom LLM makes sense. Includes live training status: 7,736 examples, fine-tuning Qwen2.5-1.5B with LoRA right now on Mac M-series. After training: $0/query, no API, full Rungs IP end-to-end. Covers two modes, 32-point benchmark lead over GPT-4, build-vs-borrow analysis, and full claim-by-claim verdict.

🧠 Cognitive Theory

💰 Products

Revenue-generating applications built on the Rungs engine.

⚖️ Civil Rights Intelligence — Vantage × MyGuardian

Vantage — Civil Rights Interactive Demo

Password Gated

Live interactive demo: 3 encounter scenarios (traffic stop, street stop, enterprise liability). Real-time decision guidance, risk ranking, and locked evidence analysis preview. Built for the MyGuardian partnership pitch. PW: vantage2026

Vantage × MyGuardian — Partnership Deck

13 Slides

Full partnership pitch deck: problem, product, market size ($170B+ TAM), revenue model (bear/base/bull), 4-phase integration plan, risk analysis, and next steps. Designed for the MyGuardian founder meeting.

💰 Consumer Recovery — 8 Live Sites

Consumer Recovery — Portfolio Overview

8 Domains

8 registered domains targeting $30B+ in unclaimed consumer money per year. Each site delivers a free probability analysis before payment. Verticals: security deposit, medical billing, chargebacks, airline claims, package recovery, and the hub at recoveryourfunds.us.

🗺️ Strategy & Roadmap

Where we're going and how we get there.

Technical Brief — Parser, Hallucination & What Rungs Actually Is

For Scott & Brian

Addresses the regex/NL parser question directly. Explains the CLadder 96.96% result and what it means for hallucination. Clarifies the architecture — what the parser is vs. what Rungs is — and where to focus.

Rungs Roadmap — What We're Building

Living Doc

10 shipped · 5 in progress · 15 planned. Two tracks: Rungs engine improvements and consumer recovery sites (8 domains). Updated as work ships.

Consumer Recovery — Go-To-Market Strategy

Strategy

Who to partner with for each of the 8 recovery domains. Partner analysis by vertical (packages, chargebacks, medical, flight, deposits), deal structures, priority matrix, and organic SEO play.

Dayflow — Product & Market Analysis

New App

AI-powered day/week/month organizer built on Rungs' causal reasoning engine. Understands task dependencies, cascades schedule changes, anchors to goals. $58B market, 200M target users. Thorough analysis: product vision, competitive landscape, revenue model, GTM strategy, market projections.

TunedAI — LLM Tuning as a Service

🔥 Marketing Roadmap

We proved it: Rungs-supervised fine-tuning takes a small LLM from 50% → 93.4% — within 1.4 points of human expert, beating GPT-4o by 16 points. The business: companies bring their model, we return turbocharged weights. Marketing roadmap, pricing, GTM strategy, and the Crazy Eddie video script.

TunedAI — Partner & Investor Brief

Plain English

The simple story: what we built, what the numbers mean, how the business works. Written for Scott and Brian — no jargon, no assumptions. Analogies: car tuning + teenager becoming an adult. Includes pricing, how a deal works, FAQ.

VAULT-AI CORP™ — Turbocharge Your Wasteland AI

🎮 Fallout Style

Marketing site concept in Fallout / Vault-Tec aesthetic. Green phosphor terminal, CRT scanlines, S.P.E.C.I.A.L. stats, before/after benchmark comparison, Wasteland Leaderboard, testimonials from Diamond City, and the full pricing table. "War never changes. But your AI can."