The ORBIT methodology in practice — where theory meets execution
The test of any methodology is not how elegant it sounds — it's what happens when real teams use it on real problems.
CHAPTER THESIS: ORBIT — Orchestrated Reliable Bounded Intent Tasks — is the integrated methodology that combines everything from Essays II and III into a working system. It's not a framework to study. It's a practice to adopt.
Each word in ORBIT carries weight:
| Component | Meaning | Why It Matters |
|---|---|---|
| Orchestrated | The AI coordinates complexity on behalf of the human | You direct; the system executes across parallel streams |
| Reliable | Glass Box transparency + audit trails + bounded autonomy | Enterprise-grade trust, not startup-grade hope |
| Bounded | Mission documents define the playing field | Maximum exploration within defined constraints |
| Intent | Natural language is the interface — state what you want | No translation layer between thought and action |
| Tasks | Everything decomposes into executable, measurable units | Progress is always visible, always traceable |
ORBIT is what you get when the pilot model (Chapter 7), the Mission Cockpit (Chapter 8), the view system (Chapter 9), living documents (Chapter 10), the architecture (Chapter 11), and the simplicity thesis with its lens mechanism (Chapters 12–16) work together as a single integrated system. In the language of Anton Korinek's research on transformative AI, when complex cognitive tasks collapse into executable functions, entire economic sectors restructure around the new cost landscape. ORBIT is the methodology that enables this restructuring at the team and enterprise level.
Large Language Models are extraordinarily capable, but they have three fundamental limitations that prevent naive use from producing enterprise-grade results: context degradation, hallucination, and loss of coherence. ORBIT's architecture is designed specifically to address each one.
The ORBIT architecture addresses these limitations not through a single technique but through a layered system of complementary strategies — each targeting specific failure modes, and together creating a compound defence that produces reliable, coherent, enterprise-grade output.
Massive Decomposition is the foundational strategy. Rather than feeding a large, complex task into a single LLM context window, ORBIT decomposes it into small, focused subtasks — each of which fits comfortably within the model's effective attention range. The MDAP framework (Meyerson et al., 2025) provides the mathematical proof: by decomposing tasks into atomic subtasks and applying multi-agent voting at each step, their system MAKER completed over one million sequential steps with zero errors — using relatively simple, non-reasoning models. The intelligence came from the system design, not the model. Smaller, well-defined tasks produce dramatically better results than large, ambiguous ones — because the model can attend fully to the information that matters.
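The decomposition pattern can be sketched in a few lines. This is an illustrative sketch, not ORBIT's actual interface; the mission text, the step names, and the `Subtask` structure are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Subtask:
    step: int
    instruction: str
    context: str  # only the information this single step needs

def decompose(mission: str, steps: list[str]) -> list[Subtask]:
    """Split a mission into atomic subtasks, each with a bounded context."""
    return [
        Subtask(step=i, instruction=text, context=f"{mission} / step {i}")
        for i, text in enumerate(steps, start=1)
    ]

tasks = decompose(
    "Add CSV export to the reporting module",
    ["write the serialiser", "wire the endpoint", "add tests"],
)
assert len(tasks) == 3
assert tasks[0].context.endswith("step 1")
```

The point of the sketch: each subtask's context stays small no matter how large the overall mission grows, which is what keeps every individual agent call inside the model's effective attention range.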
Progressive Disclosure ensures the model receives only the information it needs for the current subtask — not everything at once. This is the principle behind Retrieval-Augmented Generation (RAG): instead of loading an entire knowledge base into context, the system retrieves relevant fragments on demand. ORBIT extends this through pull-on-demand architecture: agents request specific information when they need it, rather than having everything pushed upfront. This reduces context overhead by orders of magnitude while ensuring the model works with clean, relevant inputs. Anthropic's donation of the Model Context Protocol (MCP) to the Linux Foundation's Agentic AI Foundation in December 2025 — with OpenAI, Google, Microsoft, and AWS as co-founders — effectively standardised this pattern as an industry norm. MCP now has over 10,000 active public servers, confirming that pull-on-demand context management is the architecture the industry is converging on.
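A minimal sketch of the pull-on-demand idea follows. The knowledge store and its keys are invented for illustration; a real system would sit behind MCP servers or a retrieval index rather than an in-memory dict.

```python
# Toy knowledge store standing in for MCP servers / a retrieval layer.
KNOWLEDGE = {
    "db_schema": "users(id, email); orders(id, user_id, total)",
    "style_guide": "snake_case, type hints, docstrings required",
    "deploy_runbook": "blue/green deploys via the CI pipeline",
}

def pull(keys: list[str]) -> str:
    """Return only the fragments the current subtask asked for."""
    return "\n".join(f"[{key}] {KNOWLEDGE[key]}" for key in keys)

# A schema-migration subtask pulls the schema and nothing else.
context = pull(["db_schema"])
assert "orders" in context
assert "runbook" not in context
```

The agent requests `db_schema` because its current step needs it; the style guide and runbook never enter its context window at all.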
Server-side prompt caching eliminates the cost and latency of repeatedly sending the same foundational context (system prompts, mission documents, architectural standards). Cached prompts reduce costs by up to 90% and latency by up to 85% — making it economically viable to maintain rich, persistent context across thousands of agent interactions without degrading performance.
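The economics can be checked with back-of-envelope arithmetic. The call count, token count, and price below are illustrative assumptions; the only figure taken from the text is the up-to-90% discount on cached input tokens.

```python
def repeated_context_cost(calls: int, shared_tokens: int,
                          price_per_token: float,
                          cache_discount: float = 0.90) -> tuple[float, float]:
    """Cost of resending the same foundational context, with and without caching."""
    uncached = calls * shared_tokens * price_per_token
    cached = uncached * (1 - cache_discount)
    return uncached, cached

# 10,000 agent calls, each reusing a 20k-token mission context.
uncached, cached = repeated_context_cost(10_000, 20_000, 3e-6)
assert cached < uncached  # the repeated-context term shrinks by ~90%
```

At these illustrative numbers the repeated context alone falls from roughly $600 to roughly $60, which is what makes rich persistent context affordable across thousands of interactions.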
Parallelisation and voting runs multiple agents on the same task and uses consensus mechanisms to select the best output. Wang et al.'s research on self-consistency (2023) demonstrated that sampling multiple reasoning paths and returning the majority answer significantly reduces errors — because incorrect hallucinations are unlikely to be consistent across independent runs. ORBIT applies this principle through parallel agent execution with structured voting, ensuring that confident-sounding errors are caught by disagreement.
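The voting mechanism itself is tiny. Here is a sketch with hard-coded sample answers standing in for independent model runs:

```python
from collections import Counter

def majority_vote(answers: list[str]) -> str:
    """Return the most common answer across independent samples."""
    return Counter(answers).most_common(1)[0][0]

# Five independent 'reasoning paths': the two hallucinations disagree
# with each other, so the consistent answer wins the vote.
paths = ["42", "42", "17", "42", "39"]
assert majority_vote(paths) == "42"
```

This is the self-consistency insight in miniature: wrong answers scatter, correct answers cluster, and the majority recovers the cluster.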
Specialist tools remove entire categories of hallucination by giving agents access to external verification: code execution engines that test whether generated code actually works, search tools that verify facts against sources, databases that confirm data against ground truth, and file system access that checks whether referenced files exist. Schick et al.'s Toolformer research (2023) showed that a 6-billion parameter model with tools achieves performance competitive with much larger models — because tools provide the factual grounding that generation alone cannot. Anthropic's 2025 release of advanced tool-use capabilities takes this further: Tool Search allows agents to access thousands of tools without consuming the context window (solving the tool-definition bloat problem), while Programmatic Tool Calling enables agents to invoke tools within a code execution environment — reducing context overhead while increasing precision. Both innovations validate the principle that agents should discover and use tools on demand, not carry the weight of every possible tool in their context.
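The code-execution case can be sketched directly: rather than trusting generated code, run it and check the result. The generated snippet below is a made-up example, and `exec` stands in for a proper sandboxed execution engine.

```python
def run_and_check(code: str, expected: dict) -> bool:
    """Execute generated code in a scratch namespace and verify its outputs."""
    namespace: dict = {}
    try:
        exec(code, namespace)  # stand-in for a sandboxed execution engine
    except Exception:
        return False           # runtime errors are caught, not shipped
    return all(namespace.get(k) == v for k, v in expected.items())

generated = "def double(x):\n    return x * 2\nresult = double(21)"
assert run_and_check(generated, {"result": 42})
assert not run_and_check("result = undefined_fn(21)", {"result": 42})
```

In production this would run in an isolated sandbox, but the principle is the same: generation proposes, execution verifies.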
A panel of AI agent experts — multiple specialised agents with different roles and system prompts — provides domain-specific validation. A security-focused agent reviews for vulnerabilities. An architecture agent checks for consistency. A testing agent verifies functionality. When agents with different perspectives converge on the same answer, confidence is warranted. When they disagree, the system flags the output for human review. A 2025 study on multi-agent LLM orchestration for incident response quantified the advantage: orchestrated multi-agent systems achieved 100% actionable recommendation quality compared to 1.7% for single-agent systems — an 80× improvement in specificity and 140× improvement in correctness, with zero quality variance across all trials. The evidence is clear: specialised agents working in concert produce results that no single agent can match.
Mission-bound structured documents are the most fundamental hallucination defence. When the model works within explicit constraints — product specifications, architectural schemas, interface contracts, design standards — the space of valid outputs is dramatically constrained. The model cannot hallucinate a database schema that contradicts the one defined in the mission document. It cannot generate code that violates the architectural patterns specified in the bounded artifacts. Research on controlled generation pipelines confirms that explicit constraints are the most reliable method for reducing hallucination in production systems.
Goal-seeking autonomous loops — what ORBIT calls Ralph Loops — are agents that iterate toward a defined success criterion, checking their own output against explicit goals at each step. Based on the Reflexion framework (Shinn et al., NeurIPS 2023), which demonstrated 20-22% improvements on reasoning and decision-making tasks, Ralph Loops ensure that each iteration moves closer to the mission — not further from it. The agent doesn't just generate output; it evaluates whether that output serves the objective, and self-corrects when it doesn't. A 2025 paper published in Nature on dual-loop self-reflection — inspired by metacognition — provides further validation: LLMs that critique their own reasoning against reference responses through iterative reflection cycles show measurably improved accuracy and coherence. The pattern is the same one visible in Knuth's Claude Cycles case study: 31 explorations where each failure was documented, analysed, and used to refine the next attempt. Self-reflection is not a luxury. It is the mechanism that turns iteration into learning.
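The loop structure can be sketched as follows. The "model" here is a toy stand-in that only succeeds once a reflection tells it what went wrong, which is the Reflexion pattern in miniature; all names are illustrative.

```python
def ralph_loop(attempt_fn, passes, max_iters=10):
    """Iterate toward an explicit success criterion, carrying reflections forward."""
    reflections: list[str] = []
    for i in range(max_iters):
        output = attempt_fn(reflections)
        ok, critique = passes(output)
        if ok:
            return output, i + 1            # success, and iterations used
        reflections.append(critique)        # documented failure guides the retry
    raise RuntimeError("goal not reached within budget")

# Toy stand-in: the 'model' only finds the answer after being told it missed.
def toy_attempt(reflections):
    return 42 if reflections else 41

output, iters = ralph_loop(toy_attempt,
                           lambda o: (o == 42, f"{o} missed the target"))
assert (output, iters) == (42, 2)
```

The defining feature is the explicit `passes` check: the agent evaluates its own output against the goal on every cycle, rather than generating once and hoping.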
Structured planning requires agents to create an explicit plan before executing — then follow the plan step by step, checking progress against milestones. Wei et al.'s chain-of-thought research (2022) and Yao et al.'s Tree of Thoughts (NeurIPS 2023) demonstrated that planning before acting improves performance from 4% to 74% on complex tasks. ORBIT applies this at every level: mission-level plans decompose into sprint-level plans, which decompose into task-level plans, each with explicit success criteria.
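The plan-then-execute discipline can be sketched as an explicit plan whose every step carries its own success criterion. The steps and checks below are illustrative, not a real pipeline.

```python
def execute_plan(plan):
    """Run each step and verify its success criterion before moving on."""
    completed = []
    for name, action, check in plan:
        result = action()
        if not check(result):
            return completed, name          # stop at the failed milestone
        completed.append(name)
    return completed, None                  # every milestone passed

plan = [
    ("draft schema", lambda: {"tables": 2}, lambda r: r["tables"] > 0),
    ("generate code", lambda: "def f(): ...", lambda r: "def" in r),
    ("run tests", lambda: {"passed": 5, "failed": 0},
     lambda r: r["failed"] == 0),
]
done, failed_at = execute_plan(plan)
assert done == ["draft schema", "generate code", "run tests"]
assert failed_at is None
```

Failing a milestone halts execution at a named point rather than letting errors propagate silently into later steps, which is the behaviour the mission-to-sprint-to-task hierarchy relies on.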
Agent delegation and orchestration maintains coherence across scale through hierarchical coordination. A coordinator agent holds the high-level mission context and delegates specific subtasks to specialist agents — each working within bounded scope but contributing to the unified objective. Research from Microsoft and IBM confirms that hierarchical architectures are the only viable pattern at scale (50+ agents), precisely because they maintain goal alignment that flat architectures lose.
Agent swarms — large numbers of agents working in parallel on different aspects of a problem — extend this further. Each agent in the swarm processes a portion of the work, communicating through lightweight event-driven mechanisms. Distributed consensus across the swarm reduces individual agent errors, while the orchestration layer ensures all outputs converge toward the mission. The swarm doesn't replace human judgment — it amplifies it, executing at a scale and speed that no individual could achieve while remaining tightly bound to mission-defined objectives.
| Limitation | ORBIT Technique | Research Basis |
|---|---|---|
| Context Degradation | Massive decomposition | MDAP/MAKER: 1M+ steps, zero errors (Meyerson et al., 2025) |
| | Progressive disclosure / pull-on-demand | RAG research; tool-use reduces context 99% |
| | Server-side prompt caching | 90% cost reduction, 85% latency reduction |
| Hallucination | Parallelisation & voting | Wang et al. self-consistency (2023) |
| | Specialist tools | Schick et al. Toolformer (2023) |
| | Panel of agent experts | Multi-agent debate research |
| | Mission-bound structured documents | Controlled generation pipelines |
| Loss of Coherence | Ralph Loops (goal-seeking iteration) | Shinn et al. Reflexion, NeurIPS 2023: +22% |
| | Structured planning | Yao et al. Tree of Thoughts: 4% → 74% |
| | Agent delegation & orchestration | Microsoft/IBM hierarchical patterns |
| | Agent swarms | Distributed consensus reduces errors |
No single technique solves all three limitations. The power of the ORBIT architecture is in the combination — a layered defence where each technique compensates for the weaknesses of others. Decomposition keeps context clean. Voting catches hallucinations. Planning maintains coherence. Tools provide grounding. And structured mission documents anchor everything to a clear, verifiable objective. The result is not a perfect system — no AI system is — but a system whose failure modes are visible, bounded, and correctable. That is the difference between enterprise-grade and prototype-grade AI: not the absence of errors, but the architecture to detect and recover from them.
Pilot opens cockpit. The AI summarises overnight agent activity: "3 experiments completed. 2 passed tests. 1 needs review."
Pilot reviews the failed experiment. Glass Box shows exactly what happened and why. Decision: adjust the approach, not the goal.
Morning brainstorm with the AI. "What's our highest-ROI opportunity today?" The AI synthesises across codebase health, user feedback, and the product mission. Recommends 3 options with estimated impact.
Pilot selects direction. 20 agents begin parallel execution. The pilot moves to strategic work — reviewing architecture decisions, refining the mission document.
Midday check: 4 agents have completed tasks. Glass Box shows all work, all decisions, all evidence. Pilot approves 3, requests revision on 1.
New hypothesis emerges from pattern recognition: "Users in the healthcare vertical spend 3x more time in the analytics view. Consider deepening this for the next sprint."
End of day: work that would have taken a 10-person team two weeks completed in hours. Every decision traceable. Every outcome measurable against the mission.
Marketing director opens cockpit. The AI reports: "Campaign A outperforming by 23%. Competitor X launched a new positioning. Three content opportunities identified."
Experiment initiated: "Test enterprise messaging vs. SMB messaging for the Q2 campaign." AI sets up parallel content streams, audience segments, and measurement frameworks.
Glass Box shows real-time campaign performance across all channels — email, social, paid, organic — in one view. No switching between Mailchimp, HubSpot, Google Analytics.
AI surfaces pattern: "Customers who engage with technical content convert at 2.3x the rate of those who engage with business content. Recommend increasing deep-dive content allocation by 30%."
End of day: one person has managed what previously required a content strategist, data analyst, campaign manager, and social media coordinator. All aligned to a single mission.
CEO opens cockpit. Lens: CEO + Real-time + All Functions. "Revenue tracking 8% above plan. Engineering velocity is up 40% since ORBIT adoption. Customer churn risk flagged for 3 accounts."
Drills into churn risk. Glass Box shows the data trail: support tickets up, product usage down, competitor mentioned in 2 support calls. AI recommends: "Executive outreach within 48 hours. Success probability: 72% if actioned this week."
Switches lens: CEO + Predictive + Financial. "Based on current trajectory, Q3 will exceed target by 12%. However, hiring plan creates cash flow pressure in Q4. Three scenarios modelled."
Board preparation: AI synthesises across all functions into a board-ready summary. What used to take a week of cross-functional data gathering happens in minutes.
ORBIT comes to life through a cyclical workflow that embodies the Centaur principle: human and AI working as an amplified team, each contributing what they do best. This is not a linear process — it is a loop that returns to brainstorming whenever deeper understanding is needed.
Human + AI explore ideas, challenge assumptions, and generate structured artifacts: architecture diagrams, product specs, mockups, entity models, process flows, design themes. The AI surfaces patterns, alternatives, and research. The human provides direction, judgment, and domain insight.
Human + AI evaluate the brainstorm artifacts against the Four Cs (defined in detail in The Enterprise): Concise (no unnecessary complexity), Complete (nothing critical is missing), Correct (accurate and sound), Clear (unambiguous to both human and machine). Both human and AI must be satisfied before moving forward. This is quality control at the input stage, because quality in determines quality out.
AI executes: writes code, generates configurations, produces deliverables, runs commands — all bounded by the agreed artifacts. If errors or scope changes arise, the workflow returns to Brainstorm. The human steers; the AI builds at speed.
Human + AI review the output: test results, visual inspection, functional checks. If it meets the standard — Done. If it needs refinement, the workflow loops back to Brainstorm or Build. Every iteration improves the shared understanding.
The cycle completes, or returns to Brainstorm to refine, extend, or explore the next opportunity. Every cycle produces deliverables and deepens understanding.
The power of this workflow is in the brainstorm artifacts. These are not casual notes — they are structured outputs that capture the shared understanding between human and AI: architecture diagrams, sequence flows, entity models, interface specifications, product requirements, design mockups, process definitions. Each artifact becomes a reference point that grounds subsequent work. When the AI builds, it builds from an artifact that both parties agreed on. When the human verifies, they verify against a specification that was collaboratively produced. The artifacts are the contract between human intent and AI execution.
The Four Cs — Concise, Complete, Correct, Clear — are the quality gate that makes this work. (The Four Cs are explored in depth in The Enterprise, where they are applied to contracts, APIs, and business relationships as the principle that enables complexity to collapse at boundaries.) Traditional software development suffers from ambiguous requirements that cascade into rework. The Centaur Workflow inverts this: invest in clarity at the brainstorm stage, and the build stage becomes dramatically faster and more accurate. Quality in produces quality out. Vague input produces vague output. The discipline of satisfying the Four Cs before building is what separates AI-amplified work from "vibe coding" — where speed without shared understanding produces fragile, unmaintainable results.
The workflow is cyclical, not linear. You can return to brainstorming from any phase. A failed verification triggers a deeper brainstorm. A scope change during build sends you back to agree on updated artifacts. This is the scientific method applied to building: hypothesise (brainstorm), agree on the experiment (artifacts + Four Cs), execute (build), observe results (verify), and learn. Every cycle compresses. Every iteration sharpens the shared understanding between human and AI. This is the Centaur at work — and it is how the ORBIT architecture produces enterprise-grade results.
In February 2026, Donald Knuth — widely regarded as the father of the analysis of algorithms and author of The Art of Computer Programming — published a paper titled "Claude's Cycles" describing how an open mathematical problem he had been working on for weeks was solved through Human + AI collaboration. The problem involved decomposing directed Hamiltonian cycles in high-dimensional digraphs — a challenge that had resisted Knuth's own efforts.
Filip Stappers, a colleague, posed Knuth's exact problem to Claude Opus 4.6, then guided it through 31 numbered "explorations" — iterative cycles where the AI would hypothesise an approach, write code to test it, evaluate results, fail, reframe, and try again. The workflow mirrors the Centaur model precisely: the human provided direction, persistence, and coaching; the AI provided computational exploration, pattern recognition, and creative mathematical reasoning.
Critical to the process was a structured artifact discipline. Stappers instructed Claude to update a plan.md file after every single exploration — "No exceptions. Do not start the next exploration until the previous one is documented here." This is mission-bound documentation in action: persistent artifacts that prevent goal drift across extended interactions.
The explorations revealed the limitations and the power simultaneously. Claude tried linear functions, brute-force search, simulated annealing, fiber decompositions, and serpentine patterns — most of which failed. At exploration 26, it reframed the problem entirely: "Maybe the right framing is: don't think in fibers, think directly about what makes a Hamiltonian cycle." This creative leap — emerging from the pressure of 25 failed attempts — led to the breakthrough at exploration 31. Stappers also noted that Claude hit context limitations: "After every two or three test programs were run, he had to remind Claude again that it was supposed to document its progress carefully." The human compensated for the AI's context degradation in real time.
The result: a valid decomposition for all odd values of m, verified computationally up to m = 101, with a formal proof subsequently constructed. Knuth wrote: "I think Claude Shannon's spirit is probably proud to know that his name is now being associated with such advances. Hats off to Claude!" He also noted: "It seems that I'll have to revise my opinions about 'generative AI' one of these days."
Neither Knuth nor Claude could have solved this alone. Knuth had the mathematical intuition but couldn't explore the computational search space fast enough. Claude had the computational power but needed human guidance to stay on track, recover from context loss, and recognise when a reframing was needed. The 31 explorations — each documented, each building on what came before — are a textbook demonstration of the Centaur Workflow: brainstorm, build, verify, iterate. The human steers. The AI explores at scale. The structured artifacts hold everything together.
In November 2025, Meyerson et al. at UT Austin and Cognizant AI Lab published "Solving a Million-Step LLM Task with Zero Errors" — a paper that provides the mathematical proof for why the ORBIT architecture works at scale. The paper introduces the MDAP framework (Massively Decomposed Agentic Processes) and demonstrates something that should be impossible: completing a task requiring over one million sequential LLM steps with zero errors.
The problem is fundamental. Even a highly capable LLM with a 99% per-step accuracy rate will fail catastrophically on long tasks: a 1% error rate compounds to a roughly two-in-three chance of failure after just 100 steps (0.99^100 ≈ 0.37 success). At 1,000 steps, success is essentially zero. At one million steps, it is mathematically impossible with a single agent. The researchers tested state-of-the-art models on the Towers of Hanoi benchmark and confirmed this: performance degrades catastrophically after five or six disks, with success rates plummeting to zero.
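The compounding arithmetic is easy to verify directly (Python used here purely as a calculator):

```python
p = 0.99                               # per-step accuracy
assert round(p ** 100, 3) == 0.366     # ~37% end-to-end success at 100 steps
assert p ** 1_000 < 5e-5               # essentially zero at 1,000 steps
assert p ** 1_000_000 < 1e-300         # unreachable for a single agent
```

This is why no amount of per-step polish rescues a monolithic agent on long-horizon tasks: the exponent does the damage.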
Their solution was not to build a smarter model. It was to build a smarter system. The MDAP framework rests on three principles that map directly to ORBIT's architecture:
1. Maximal Agentic Decomposition — break every task into the smallest possible atomic subtasks, each assigned to a focused "microagent." Each agent sees only the information it needs for its single step, eliminating context degradation entirely. When steps are small enough, even relatively simple models achieve near-perfect accuracy on each one.
2. Multi-Agent Voting — for each subtask, multiple agents independently generate solutions. A first-to-ahead-by-k voting protocol selects the answer that achieves consensus. The paper proves mathematically that if a single agent's per-step accuracy p exceeds 0.5, the voting process can amplify reliability to arbitrarily high levels. Errors don't accumulate because they are caught and eliminated at every step.
3. Red-Flagging — outputs whose structure suggests increased error risk (such as formatting anomalies or internal inconsistencies) are discarded before they enter the voting pool. This catches correlated errors that voting alone might miss.
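The voting protocol from principle 2 can be sketched as follows. This is a simplified reading of first-to-ahead-by-k, with a hard-coded answer stream standing in for repeated independent agent runs:

```python
from collections import Counter

def first_to_ahead_by_k(answer_stream, k=2):
    """Sample until one answer leads every rival by k votes, then commit."""
    votes = Counter()
    for answer in answer_stream:
        votes[answer] += 1
        leader, lead_count = votes.most_common(1)[0]
        runner_up = max((c for a, c in votes.items() if a != leader),
                        default=0)
        if lead_count - runner_up >= k:
            return leader
    return None                      # stream exhausted without consensus

# One early wrong sample is outvoted as soon as 'B' leads by 2.
assert first_to_ahead_by_k(iter(["A", "B", "B", "B"]), k=2) == "B"
```

The key property is that the protocol samples only as much as disagreement demands: easy steps settle in k votes, hard steps draw more samples until confidence is earned.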
The result: their system MAKER (Maximal Agentic decomposition, first-to-ahead-by-K Error correction, and Red-flagging) completed over one million steps of the 20-disk Towers of Hanoi — a task requiring 2^20 − 1 = 1,048,575 sequential moves — with zero errors. The base LLMs used were not frontier reasoning models. They were relatively small, non-reasoning models. The intelligence came from the system design, not the model.
The paper's conclusion is the thesis of the ORBIT architecture stated in formal terms: "Instead of relying on continual improvement of current LLMs, massively decomposed agentic processes may provide a way to efficiently solve problems at the level of organizations and societies." Reliability at scale comes from structure and decomposition, not from model intelligence. You don't need a perfect AI. You need a perfect system design.
These two case studies — Knuth's Claude Cycles and the MDAP framework — illuminate the two complementary dimensions of the ORBIT architecture. Knuth demonstrates that human + AI collaboration, through structured iteration and documented artifacts, solves problems that neither can solve alone. MDAP demonstrates that architectural design — decomposition, voting, and error detection — achieves reliability at scale that no single model can match. ORBIT combines both: the Centaur Workflow for creative discovery, and the MDAP-informed architecture for reliable execution. The human steers. The system scales. The artifacts hold it together.
ORBIT isn't a project management methodology. It's a value discovery engine powered by the Centaur Workflow — a cyclical collaboration between human judgment and AI capability — built on an architecture where reliability comes from system design, not model intelligence. The brainstorm artifacts are the contract. The Four Cs (Concise, Complete, Correct, Clear) are the quality gate. Decomposition, voting, and mission-bound constraints eliminate the compounding errors that make naive AI use fail at scale. The question shifts from "How do we execute this plan?" to "What do we need to learn, and how fast can we learn it?" When cycle time drops from weeks to hours, every hypothesis becomes testable, every assumption becomes verifiable, and every opportunity becomes explorable.
The value isn't in any single feature — it's in what happens when everything works together.
CHAPTER THESIS: Individual features deliver incremental improvement. An integrated system delivers compound transformation. The complete value picture is exponential, not additive.
| Capability | Standalone Value | Integrated Value |
|---|---|---|
| AI assistant | Faster individual tasks | — |
| + Mission alignment | Tasks aligned to goals | Direction + speed |
| + Transparency | Visible AI reasoning | Trust + speed + direction |
| + Multiple perspectives | Different stakeholder views | Alignment + trust + speed + direction |
| + Safe experimentation | Bounded parallel exploration | Learning + alignment + trust + speed |
| + Pattern recognition | Emergent insight across data | Innovation + learning + alignment + trust + speed |
| = ORBIT | — | The compound exceeds the sum by orders of magnitude |
This is the integration premium: each capability amplifies the others. Transparency makes experimentation trustworthy. Safe experimentation makes Living Documents adaptive. Living Documents make mission alignment dynamic. Mission alignment makes pattern recognition relevant. Pattern recognition feeds back into better hypotheses for the next experiment.
Recall from Chapter 6: Total Complexity = Σ(Mission Complexities) + Σ(Interface Costs)
The complete ORBIT system attacks both terms simultaneously:
| | Before ORBIT | With ORBIT |
|---|---|---|
| Mission Complexities | High — fragmented understanding | Reduced — clear Commander's Intent |
| Interface Costs | Massive — 130+ tools, siloed teams | Near zero — one cockpit, one AI |
| Total Complexity | Overwhelming | Manageable → Collapsing |
When interface costs approach zero, something remarkable happens: the system's natural complexity becomes the only complexity. And natural complexity — the inherent difficulty of the problems you're solving — is the complexity you want. It's where the value lives.
An AI chatbot makes you faster. A mission-aligned, transparent, lens-equipped, experiment-capable, discovery-enabled cockpit makes you fundamentally different. The complete value picture isn't "do the same things faster" — it's "do entirely different things that were previously impossible."
You can manufacture more of anything except time. Which means time waste is the only truly irreversible loss.
CHAPTER THESIS: Time is the one resource that can't be manufactured, stored, or recovered. The Productivity Supernova returns time to humans by eliminating the waste embedded in fragmented, complex systems.
Every enterprise process carries a hidden time tax — time consumed not by the work itself but by the complexity surrounding the work:
| Process | Actual Work Time | Complexity Time Tax | Total Time | Tax Rate |
|---|---|---|---|---|
| Software feature | 2 days coding | 8 days (meetings, reviews, deployment) | 10 days | 80% |
| Marketing campaign | 3 days creative | 12 days (approvals, coordination, assets) | 15 days | 80% |
| Sales proposal | 1 day writing | 4 days (research, pricing, legal review) | 5 days | 80% |
| Financial close | 2 days reconciliation | 8 days (data gathering, verification) | 10 days | 80% |
| Hiring decision | 1 day interviews | 19 days (sourcing, scheduling, consensus) | 20 days | 95% |
The pattern is striking: across functions, the complexity time tax consistently consumes 80% or more of total process time. The actual valuable work is a fraction of the elapsed time.
The Software Development Life Cycle provides the most documented evidence of time collapse:
Traditional SDLC: Requirements → Design → Build → Test → Deploy → Monitor
(2 weeks + 2 weeks + 4 weeks + 2 weeks + 1 week = 11 weeks)
Post-collapse: Intent → AI translates → Agents build → Glass Box validates
(1–3 days total = 90%+ compression)
This isn't theoretical. Teams using AI-assisted, mission-aligned development workflows are demonstrating 10-50x compression of traditional timelines — not by cutting corners but by eliminating the coordination overhead, context switching, tool navigation, and waiting that constituted the vast majority of elapsed time.
The same compression applies to every enterprise function once complexity collapses:
| Enterprise Process | Traditional Timeline | Post-Collapse | Time Returned |
|---|---|---|---|
| Quarterly business review | 3 weeks preparation | Real-time (always ready) | 3 weeks |
| Competitive analysis | 2 weeks research | 2 hours (AI synthesis) | ~2 weeks |
| Compliance audit | 4 weeks | Continuous (automated) | 4 weeks per cycle |
| Customer 360 report | 5 days (cross-system data) | Instant (unified cockpit) | 5 days |
| Strategic planning cycle | 6 weeks | 1 week (AI-modelled scenarios) | 5 weeks |
| New employee onboarding | 3 months to productivity | 3 weeks (AI-guided) | 10 weeks |
McKinsey research shows knowledge workers spend an average of 8.2 hours per week searching for information that already exists somewhere in the organisation. That's over 400 hours per year per person — 10 full work weeks — consumed entirely by complexity. A unified Knowledge Fabric eliminates this waste completely.
The Productivity Supernova doesn't just make processes faster — it returns time to humans. And unlike cost savings that show up in spreadsheets, returned time compounds. An engineer who gets 6 hours back per day doesn't just write more code — they think more deeply, design more carefully, and discover opportunities they never had time to notice.
$4.3 trillion in unmet human needs. Not because we lack intelligence, but because complexity made serving those needs uneconomical.
CHAPTER THESIS: The Productivity Supernova doesn't just make existing work faster — it makes previously impossible work possible. The market expansion that follows is not incremental but explosive.
In 1865, economist William Stanley Jevons observed something counterintuitive: as steam engines became more efficient, coal consumption increased. The cheaper energy became, the more uses people found for it.
This principle — Jevons Paradox — predicts what happens when AI collapses the cost of intelligent work:
AI gets more efficient → cost of intelligent work drops → more use cases become viable → total demand for intelligent work increases → more human roles needed (directing, judging, creating) → net employment grows.
AI inference costs have dropped over 280-fold in 18 months. Yet combined hyperscaler capital expenditure for AI infrastructure is projected to reach $602 billion in 2026 — a 36% increase. Cheaper AI creates more AI use, which creates demand for more AI infrastructure. Total hyperscaler capex from 2025-2027: projected $1.15 trillion.
When building capacity multiplies by 1000x, markets that were previously uneconomical emerge:
| Market Category | Why It Couldn't Exist Before | Size/Trajectory |
|---|---|---|
| Custom enterprise software | Too expensive for SMBs | Previously $30M → now <$1M (Inc. Magazine) |
| Personalised education | Required 1:1 tutoring at scale | EdTech projected $1.28T by 2034 |
| Rural telemedicine | Infrastructure + specialist costs | 2 billion people without healthcare access |
| Micro-SaaS for niche markets | Development costs exceeded market size | Print-on-demand: $10.2B → $103B by 2034 |
| AI-native creative tools | Required human specialists | Creator economy: $191B → $480-1,490B by 2027-2034 |
The resource being "consumed" isn't labour — it's human creativity and intent. And as Jevons would recognise, the appetite for creativity is infinite.
When barriers to building collapse, entrepreneurship explodes:
What once required $30 million can now be accomplished with less than $1 million. The infinite ocean is real. ORBIT gives every fisherman a 1000x larger net.
The data dismantles the job-destruction narrative:
| Metric | Impact | Source |
|---|---|---|
| AI-assisted customer service agents | 14% more productive on average | Research |
| Least experienced workers with AI | 35% more productive | Research |
| Experience equivalence | 2 months + AI = 6 months without AI | Research |
| AI wage premium | 56% higher salaries (up from 25% prior year) | Research |
| New job categories created | AI Ethics Officers, MLOps Engineers, Expert AI Trainers ($100s/hour) | Industry data |
The pilot model embodies this: the human doesn't become obsolete — they become the most valuable component. The pilot who directs 20 AI agents toward a clear mission is worth more, not less, than they were before. And as the infinite ocean opens up, demand for human creativity doesn't shrink. It multiplies.
The fear of "AI taking all the jobs" misunderstands economics. When the cost of intelligent work drops, demand doesn't decrease — it explodes. Regional hospitals, small businesses, niche industries, and individual creators couldn't afford custom solutions before. As AI collapses costs, new markets emerge, new businesses form, and the total demand for human creativity grows. The pie doesn't shrink. It multiplies.
The hardest problem isn't building the solution — it's discovering what solution to build.
CHAPTER THESIS: Most ambitious projects fail not from poor execution but from solving the wrong problem. The methodology must match the nature of the problem — and ORBIT is purpose-built for the Complex domain where most real work lives.
Two government projects. Same era. Radically different outcomes:
| Project | Method | Budget | Result |
|---|---|---|---|
| Healthcare.gov (2013) | Waterfall (detailed planning) | $600M | 6 users on launch day |
| FBI Sentinel (2012) | Agile (after waterfall failed) | $99M | Completed in 12 months |
The Standish Group's CHAOS reports show agile projects succeed at nearly three times the rate of waterfall projects. Yet waterfall persists because it feels more responsible. It produces impressive Gantt charts, detailed requirements, and the comforting illusion of predictability.
The illusion is the problem: the plan assumes you already know what you need to know.
Dave Snowden's Cynefin framework reveals why different problems demand different approaches:
| Domain | Cause and Effect | Approach | Examples |
|---|---|---|---|
| Complex | Only understood in retrospect | Probe → Sense → Respond | Most software products, market strategy, customer behaviour |
| Complicated | Determinable through analysis | Sense → Analyse → Respond | Bridge design, accounting, known engineering |
| Chaotic | No discernible cause and effect | Act → Sense → Respond | System down, crisis response |
| Clear (Obvious) | Obvious to all | Sense → Categorise → Respond | Processing an invoice, standard procedures |
The critical insight: Healthcare.gov was treated as a Complicated problem (detailed planning, expert analysis, execute to spec) when it was actually Complex (unprecedented integration, unknown user behaviour, evolving requirements). The methodology mismatch was fatal.
| Question | If Yes → | If No → |
|---|---|---|
| Do we know what users want? | Complicated territory. Planning works. | Complex territory. Experiment. |
| Has this exact problem been solved before? | Analogy and best practices apply. | First-principles analysis needed. |
| Is the cost of being wrong high? | Smaller experiments, more validation. | Move faster, correct as you go. |
| Is the environment stable? | Longer planning horizons OK. | Shorter cycles essential. |
| Do we have product-market fit? | Maximise exploitation (optimise). | Maximise exploration (discover). |
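The diagnostic above can be sketched as a tiny routing function. This is a sketch of my reading of the table, not an ORBIT API; the function name and the two inputs chosen are assumptions for illustration:

```python
# Sketch: route a problem to a Cynefin-style approach using two of the
# diagnostic questions above. The mapping is illustrative, not canonical.

def recommend_approach(know_what_users_want: bool, solved_before: bool) -> str:
    """Suggest a methodology based on the diagnostic questions."""
    if know_what_users_want and solved_before:
        return "Complicated: plan, apply best practices, execute to spec"
    if know_what_users_want:
        return "Complicated: plan, but analyse from first principles"
    return "Complex: probe with small experiments, sense, respond"

# Healthcare.gov as described in the chapter: user behaviour unknown,
# unprecedented integration -> Complex, not Complicated.
print(recommend_approach(know_what_users_want=False, solved_before=False))
```

The value of writing it down this way is that the methodology choice becomes an explicit, reviewable decision rather than a default.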
The nuanced truth: Even within a single product, different components may require different approaches. Infrastructure might be Complicated (use proven patterns). User experience might be Complex (experiment continuously). A production outage is Chaotic (act first, analyse later).
ORBIT doesn't pick one methodology — it enables all of them, matched to the moment:
| Principle | Traditional Approach | ORBIT Approach |
|---|---|---|
| OODA Loop speed | 5 experiments per quarter | 50 experiments per week |
| Cost of experimentation | $50K+ per hypothesis test | Near zero (AI + agents) |
| Exploration capacity | Pick 3 directions, commit | Test 20 directions simultaneously |
| Feedback latency | Weeks to months | Hours to days |
| First principles thinking | Too expensive — settle for analogy | Affordable — question every assumption |
| Antifragile learning | Failures punished, lessons lost | Failures celebrated, lessons compounded |
When building an MVP takes hours instead of weeks, affordable-loss calculations change completely. You can try more ideas. You can question more assumptions. You can explore more of the possibility space.
- Instagram pivoted from Burbn (location check-ins) to photos in 8 weeks after data revealed what users actually wanted → 1M users in 2 months
- SpaceX's first three rockets crashed. The fourth succeeded. "That was the last money we had" — Elon Musk. They now secure 90%+ of international commercial launch contracts
- Sean Ellis's product-market fit test: if 40%+ of users say they would be "very disappointed" without your product, you likely have fit. Below that, keep iterating
- Toyota receives over 700,000 improvement suggestions per year — and implements most of them
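The Sean Ellis test above reduces to simple arithmetic on survey responses. A minimal sketch, assuming responses arrive as plain strings (the response labels and sample data are hypothetical):

```python
# Sketch of the Sean Ellis product-market-fit test: the share of
# respondents answering "very disappointed" to the question "How would
# you feel if you could no longer use the product?". 40%+ suggests fit.

def pmf_score(responses: list[str]) -> float:
    """Fraction of respondents who would be 'very disappointed'."""
    if not responses:
        return 0.0
    return responses.count("very disappointed") / len(responses)

# Hypothetical survey: 9 of 20 respondents would be very disappointed.
survey = (["very disappointed"] * 9
          + ["somewhat disappointed"] * 8
          + ["not disappointed"] * 3)

score = pmf_score(survey)
verdict = "likely fit" if score >= 0.40 else "keep iterating"
print(f"{score:.0%} very disappointed -> {verdict}")
# 9 of 20 = 45%, above the 40% threshold
```

At 1000x experiment speed, running this survey after every iteration costs almost nothing, which is exactly why the "keep iterating" branch becomes affordable.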
The question "How do you build something when you don't know what it should be?" has an answer: you build small, learn fast, and adapt continuously. You probe the Complex domain with experiments rather than trying to analyse it into submission. You match your method to your moment. ORBIT is the engine that makes this possible at 1000x speed.
The methodology is proven. How does it scale?
↓ ESSAY V: THE ENTERPRISE