When "humans review the PR" hits its ceiling in the AI era — what comes next.
ai-desk methodology — Hiroyuki OKINOI · 2026-05
The doubt is justified: as long as we assume that "humans must read the code", AI parallel development does not scale.
So what happens if we drop that assumption?
The gap between AI generation rate and human reading rate widens every year.
As long as "carefully review every PR" stays mandatory, total throughput is capped by human reading capacity. Adding more reviewers just means multiple humans reading the same PR.
Even human-reviewed code ships with vulnerabilities.
"Humans review" is neither necessary nor sufficient. The basis for confidence is weaker than it feels.
Review is "better than nothing", but "reviewed" does not mean "safe". Whether the code was written by an AI or by a human, review alone guarantees little.
What humans give up: chasing lines of code. What humans keep: "what to build" and "is it working".
The human is involved only at the start and the end. The three middle stages are handled entirely by machines.
Tools shipped in ai-desk:
ai-desk.js (code editing) · ai-eyes.js (screen observation / input injection / video recording) · eyes-e2e.js (state → text).
Human load becomes O(intent count), no longer dependent on code volume.
This is the structure that makes "single-developer full-time parallel execution" work.
Code written without reading — surely you can't fix it 6 months later?
The fixer is also AI. As long as the original intent and the E2E tests are preserved, AI can re-read and patch.
ai-desk's tool for this is Emblem-scoped editing (skeleton / focus / apply).
Even in a 100k-line file, AI only reads the relevant emblem to make the change.
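As a rough illustration of the skeleton / focus / apply idea, here is a minimal sketch. Everything in it is an assumption made for illustration: the `// @emblem <name>` and `// @end` comment markers, and the function names `emblemSkeleton`, `emblemFocus` and `emblemApply`, are hypothetical and are not taken from ai-desk's actual implementation. The point is only that an editor can list named regions, read one of them, and replace one of them without loading the rest of the file into its working context.

```javascript
// Hypothetical marker syntax, assumed for illustration only:
//   // @emblem <name>   opens a named region
//   // @end             closes it
const OPEN = /^\s*\/\/ @emblem (\S+)\s*$/;
const CLOSE = /^\s*\/\/ @end\s*$/;

// skeleton: list every emblem with its line range, without returning bodies.
function emblemSkeleton(source) {
  const emblems = [];
  let current = null;
  source.split("\n").forEach((line, i) => {
    const m = line.match(OPEN);
    if (m) {
      current = { name: m[1], start: i };
    } else if (CLOSE.test(line) && current) {
      emblems.push({ ...current, end: i });
      current = null;
    }
  });
  return emblems;
}

// Find one emblem's line range, or fail loudly if it does not exist.
function locateEmblem(source, name) {
  const hit = emblemSkeleton(source).find((e) => e.name === name);
  if (!hit) throw new Error(`unknown emblem: ${name}`);
  return hit;
}

// focus: return only the body lines of one emblem.
function emblemFocus(source, name) {
  const { start, end } = locateEmblem(source, name);
  return source.split("\n").slice(start + 1, end).join("\n");
}

// apply: replace one emblem's body, leaving every other line untouched.
function emblemApply(source, name, newBody) {
  const { start, end } = locateEmblem(source, name);
  const lines = source.split("\n");
  return [
    ...lines.slice(0, start + 1), // everything up to and including the opening marker
    ...newBody.split("\n"),       // the replacement body
    ...lines.slice(end),          // the closing marker and everything after it
  ].join("\n");
}

// Usage: patch the "tax" emblem only.
const file = [
  "// @emblem price",
  "function price(qty) { return qty * 100; }",
  "// @end",
  "// @emblem tax",
  "function tax(amount) { return amount * 0.1; }",
  "// @end",
].join("\n");

const patched = emblemApply(file, "tax", "function tax(amount) { return amount * 0.08; }");
console.log(emblemSkeleton(patched).map((e) => e.name)); // [ 'price', 'tax' ]
console.log(emblemFocus(patched, "tax"));
```

In this sketch the cost of a change depends on the size of the emblem being edited, not on the size of the file, which is the property the slide relies on.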
Premise: AI doesn't go away. With multi-vendor competition and maturing local models, this is structurally guaranteed rather than wishful thinking.
A new member who can't read code can't join the project, right?
The handoff material = intent declarations + working E2E tests + screenshots + video.
The new member reads "what does this service do", not "how is it implemented", and asks AI for changes.
Compared with "read the code and infer the spec", seeing the intent and the observed behaviour directly makes for faster onboarding.
Claim: if it works for an individual, it works for a team — same method, same primitives.
Without reading, you'll never notice when AI does something weird.
"Reading" is subjective inspection. Sometimes it catches things, sometimes it doesn't (Heartbleed sat unnoticed in OpenSSL for about two years).
ai-desk relies on objective inspection:
・Exhaustive E2E (try all 1920 worlds)
・Twin (double-entry math) verification (re-compute GPU output with a pure CPU function)
・Event sourcing + hash chain (mathematical tamper detection)
These are stronger guarantees than "a human read it".
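To make the third item concrete, here is a minimal sketch of an event log protected by a hash chain. It is an illustration under stated assumptions, not ai-desk's implementation: the entry shape (`payload`, `prevHash`, `hash`), the all-zero genesis value, the choice of SHA-256, and the function names are all hypothetical. It uses Node's built-in `crypto` module.

```javascript
const { createHash } = require("crypto");

// Assumed starting value for the chain (hypothetical convention).
const GENESIS = "0".repeat(64);

// Hash of one entry: previous hash plus the serialized event payload.
// JSON.stringify keeps insertion order, so payloads must be built consistently.
function entryHash(prevHash, payload) {
  return createHash("sha256")
    .update(prevHash)
    .update(JSON.stringify(payload))
    .digest("hex");
}

// Append an event, linking it to the hash of the previous entry.
function appendEvent(log, payload) {
  const prevHash = log.length ? log[log.length - 1].hash : GENESIS;
  return [...log, { payload, prevHash, hash: entryHash(prevHash, payload) }];
}

// Recompute every hash; return the index of the first broken entry, or -1.
function firstBrokenIndex(log) {
  let prevHash = GENESIS;
  for (let i = 0; i < log.length; i++) {
    const entry = log[i];
    const expected = entryHash(prevHash, entry.payload);
    if (entry.prevHash !== prevHash || entry.hash !== expected) return i;
    prevHash = entry.hash;
  }
  return -1;
}

// Usage: build a short log, then alter the middle event without re-hashing.
let log = [];
log = appendEvent(log, { type: "deposit", amount: 100 });
log = appendEvent(log, { type: "withdraw", amount: 30 });
log = appendEvent(log, { type: "deposit", amount: 5 });

const tampered = log.map((entry, i) =>
  i === 1 ? { ...entry, payload: { type: "withdraw", amount: 3 } } : entry
);

console.log(firstBrokenIndex(log));      // -1
console.log(firstBrokenIndex(tampered)); // 1
```

A chain like this detects edits that leave the stored hashes untouched. Someone who can rewrite the whole log can also recompute every hash, so in practice the latest hash usually has to be recorded somewhere the editor cannot reach.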
Action-game cancel/combo logic, exhaustively tested across 1920 worlds.
What the human reviews = "all 1920 worlds passed". The contents of individual worlds are not read.
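A minimal sketch of what "try every world" can look like, assuming a world is one combination of discrete input dimensions. The dimensions, the cancel rule and the function names below are hypothetical and much smaller than the 1920 worlds mentioned above; they are not the actual game logic. Each world is checked by comparing the implementation under test with an independently written reference, in the spirit of the twin verification listed earlier.

```javascript
// Hypothetical dimensions: 3 * 3 * 2 * 4 = 72 worlds in this toy example.
const DIMENSIONS = {
  state: ["startup", "active", "recovery"],
  input: ["none", "attack", "dodge"],
  hitConfirmed: [false, true],
  frame: [0, 1, 2, 3],
};

// Enumerate the full cartesian product of all dimensions.
function* worlds(dimensions) {
  const keys = Object.keys(dimensions);
  function* expand(i, partial) {
    if (i === keys.length) {
      yield { ...partial };
      return;
    }
    for (const value of dimensions[keys[i]]) {
      yield* expand(i + 1, { ...partial, [keys[i]]: value });
    }
  }
  yield* expand(0, {});
}

// Implementation under test (hypothetical cancel rule).
function canCancel({ state, input, hitConfirmed, frame }) {
  if (state !== "recovery" || input === "none") return false;
  if (input === "dodge") return frame < 2;
  return hitConfirmed && frame < 2;
}

// Independently written reference for the same rule, table-driven.
function canCancelReference(w) {
  const windowOpen = w.frame === 0 || w.frame === 1;
  const allowed = { none: false, attack: w.hitConfirmed, dodge: true };
  return w.state === "recovery" && windowOpen && allowed[w.input];
}

// Run every world and collect the ones where the two versions disagree.
function runAll(dimensions, underTest, reference) {
  let total = 0;
  const failures = [];
  for (const world of worlds(dimensions)) {
    total++;
    if (underTest(world) !== reference(world)) failures.push(world);
  }
  return { total, failures };
}

const report = runAll(DIMENSIONS, canCancel, canCancelReference);
console.log(`${report.total - report.failures.length}/${report.total} worlds passed`);
// prints "72/72 worlds passed"
```

The human reads only the summary line. The `failures` array is there for the machine (or the AI doing the fix) to consume when something disagrees.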
"Mechanically tried everything" is a stronger guarantee than "a human read part of it".
This is not giving up on "reading the code".
It is replacing it with what machines can guarantee.
This is not a "technique improvement"; it is a redesign of how roles are split.
In an era where AI writes the code, narrow the human's responsibility to intent and verification.
Once that happens, the human's load grows only with the number of intents, however much the code grows.
github.com/AoyamaRito/ai-desk
Hiroyuki OKINOI · Aoyama Rito