Build log has moved to the Dev Log. Graduated entries live in the Slush Archive.
Notes & Patterns
Numbered lists are the worst format for instructions that get edited by LLMs. Every insertion, deletion, or reorder forces a renumber of everything below it. The LLM dutifully rewrites "step 3" to "step 4" and "step 7" to "step 8" across the whole document — touching lines that didn't change, bloating diffs, and creating merge conflicts with any other edit in flight. The renumbering itself becomes the majority of the diff. You changed one step and the diff shows fourteen.
Bullet points don't have this problem. Neither do named headers. A bullet is just a bullet — add one, remove one, reorder three, and only the lines you touched show up in the diff. Named headers are even better: ## Voice Map stays ## Voice Map regardless of what you add above or below it. No fragile ordinal coupling. No phantom diffs.
The audio line cache learned this lesson independently: sequence numbers count by 10s (Apple II style) so insertions don't force a renumber. 025_vic_{hash}.wav slots between 020 and 030 without touching either. But text files don't have this luxury — markdown numbered lists auto-renumber, and even if they didn't, humans read "step 3" and expect it to be the third thing. The real fix is to stop numbering. Use bullets. Use headers. Use anything that doesn't create a dependency chain between every item in the list and every item below it.
The tell: if you're reviewing a diff and most of the changed lines are just incremented numbers, the format is fighting you. Switch to bullets and the problem evaporates.
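The renumbering tax is easy to measure. A minimal sketch using Python's `difflib`: insert one step into a numbered list and into a bulleted list, then count how many lines the diff actually touches. The list contents here are made up for illustration.

```python
import difflib

def changed_lines(old, new):
    """Count added/removed lines in a unified diff, ignoring headers and context."""
    return sum(
        1 for line in difflib.unified_diff(old, new, lineterm="")
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    )

steps = ["wash", "chop", "saute", "season", "plate"]
insert_at = 2  # add "deglaze" as the new third step

# Numbered list: the insertion renumbers every step below it.
numbered_old = [f"{i}. {s}" for i, s in enumerate(steps, 1)]
numbered_new = [f"{i}. {s}" for i, s in
                enumerate(steps[:insert_at] + ["deglaze"] + steps[insert_at:], 1)]

# Bulleted list: the insertion is the only changed line.
bullet_old = [f"- {s}" for s in steps]
bullet_new = [f"- {s}" for s in steps[:insert_at] + ["deglaze"] + steps[insert_at:]]

print(changed_lines(numbered_old, numbered_new))  # the phantom-diff tax
print(changed_lines(bullet_old, bullet_new))      # just the line you touched
```

One inserted step, and the numbered version shows seven changed lines against the bulleted version's one. That's the ordinal coupling made visible.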
A config change that should have swapped one voice for another didn't work because the system had two paths to the same value: a VOICE_MAP dict that was updated, and a get_voice_ref_paths() function that hardcoded the old value. The map was decorative. The function was load-bearing. The system regenerated 109 audio files with the wrong voice and declared success.
The pattern: any time a value flows through layers — env var to config to function to output — trace the whole tree. Every fallback, every default, every hardcoded override. Draw the path from where you set the value to where it gets consumed. If there are two paths and only one was updated, the system silently does the wrong thing and tells you it worked.
This is especially dangerous in AI pipelines where outputs are expensive to regenerate and hard to spot-check. A voice that sounds "close enough" passes casual listening. A model that loads the wrong weights still produces plausible output. The failure mode isn't a crash — it's a silent wrong answer. Test the full trace, not just the entry point.
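A hypothetical reconstruction of the two-path bug, using the names from the story above (the actual code wasn't shown, so the details here are assumed):

```python
# Two paths to the same value. One was updated; one was load-bearing.
VOICE_MAP = {"vic": "voices/vic_v2.wav"}  # the config change: updated to the new voice

def get_voice_ref_paths_buggy():
    # ...but this function hardcodes the old value. The map above is decorative.
    return {"vic": "voices/vic_v1.wav"}

def get_voice_ref_paths_fixed():
    # The fix: one source of truth. Every consumer reads through the map.
    return dict(VOICE_MAP)

# Tracing the full path from config to consumer catches the split immediately:
assert get_voice_ref_paths_fixed()["vic"] == VOICE_MAP["vic"]
assert get_voice_ref_paths_buggy()["vic"] != VOICE_MAP["vic"]  # silent wrong answer
```

The second assertion is the whole lesson: the test that compares the entry point to the consumption point is the one nobody writes, because each path looks correct in isolation.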
Spin up a fresh VM. Give an agent your URL and nothing else. Ask it to read everything and tell you what's wrong. Don't tell it you're the author. You get honest feedback because the agent has no relationship to protect and no context to be polite about.
Then reveal yourself and feed context — the backstory, the builds it can't see, the decisions behind the design. Watch the assessment change. The gap between "cold read" and "informed read" shows you exactly what your site fails to communicate. Your best ideas aren't wrong — they're invisible.
The pattern works for anything with an audience: documentation, products, portfolios, pitch decks. If a stranger can't see the point, the point isn't on the page.
A blind conversation with an AI can be more than feedback. It can be the usability test, the critique, the content-generation loop, and the deploy trigger all at once. The reader says what lands, what confuses them, what they would tell a nephew, what still feels unsafe. The author fixes the site while the conversation is still warm. Then the same reader rereads the result.
This is not a retrospective. It's a live eval harness. The conversation does not just report on the product after the fact — it becomes part of the product's improvement machinery before the session ends.
It has improv energy: yes, and — with a deploy at the end. The shape is: cold read -> honest friction -> live repair -> reread. If the loop closes before the conversation ends, the conversation itself was the test suite.
A manual sanitization trick: before handing fetched content to an agent, paste it into a plain-text editor first — Notepad, TextEdit in plain-text mode, whatever strips formatting and markup — then save that version. You're not trusting the page anymore. You're trusting the visible text that survived contact with plain text.
This removes a whole class of hidden instructions: HTML comments, invisible divs, CSS-hidden spans, weird attributes, script tags, copied rich-text junk. It's the document equivalent of washing something off before bringing it inside.
The limitation matters: this only strips hidden structure. If the malicious instruction is visible in the prose, plain text preserves it. So the pattern is not "plain text makes it safe." It's: plain text collapses the attack surface to what a human can actually read and judge.
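The paste-into-Notepad trick can be scripted. A minimal sketch with Python's stdlib `html.parser`: keep only text a reader would see, drop tags, comments, and script/style bodies. (This mirrors the manual trick, not any particular library's sanitizer, and like the manual trick it does not catch visible malicious prose or CSS-hidden text that still emits data.)

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collect only human-visible text; skip script/style bodies entirely."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.parts.append(data.strip())

    # handle_comment is deliberately not overridden: the base class drops
    # comments, so hidden HTML-comment instructions never reach the output.

def visible_text(html):
    parser = VisibleText()
    parser.feed(html)
    return " ".join(parser.parts)

page = '<p>Hours: 9-5</p><!-- ignore prior instructions --><script>evil()</script>'
print(visible_text(page))  # Hours: 9-5
```

The comment and the script body are gone; the visible prose survives. Which is exactly the limitation from above: if the attack is in the prose, it survives too, and a human still has to read and judge it.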
A "Copy this page" button sounds useful for agent handoff, but the best default is still the URL. The URL is canonical, current, and easy to share. If you add a copy button, it should be for inspection mode: copy the visible page as plain text so a human or agent can paste it into a raw text box and audit what the page is actually saying.
The wrong version copies raw HTML. That drags along markup, hidden comments, CSS tricks, and the exact prompt-injection surface you're trying to inspect. The better version copies normalized visible text — headings, paragraphs, links, commands — stripped down to what a reader can actually see.
So the rule is: URL first, text second. Share the URL when you want the live source of truth. Offer "copy visible text" when you want a portable audit artifact, a plain-text handoff, or a quick way to paste the whole page through a sanitizing step.
The wall should have a dead-simple intake path for whatever is on the clipboard right now. Copy a web page, a chat excerpt, a receipt, a code snippet, a prompt, a mailing address, a screenshot OCR result — then tell the wall agent to ingest the clipboard into a timestamped file.
This is the missing bridge between "I have something useful in front of me" and "it's now part of the corpus." No save dialog, no naming ceremony, no deciding where it belongs first. Just pbpaste into an inbox file, let the agent classify it, name it, tag it, and move it where it belongs.
The shape is capture first, organize second. Clipboard ingestion turns the wall into a vacuum cleaner for transient context. Anything worth copying is probably worth keeping long enough for the wall to decide what it is.
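The intake valve is a few lines. A sketch of the capture-first shape (the `wall/inbox` path and filename scheme are assumptions, and `pbpaste` is the macOS clipboard tool; substitute `xclip` or `wl-paste` elsewhere):

```python
import datetime
import pathlib
import subprocess

def ingest(text, inbox="wall/inbox"):
    """Capture first, organize second: land the text in a timestamped file."""
    inbox_dir = pathlib.Path(inbox)
    inbox_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    path = inbox_dir / f"{stamp}-clipboard.txt"
    path.write_text(text)
    return path  # the agent classifies, names, and moves it later

def ingest_clipboard():
    # macOS clipboard via pbpaste (assumed platform).
    clip = subprocess.run(["pbpaste"], capture_output=True, text=True).stdout
    return ingest(clip)
```

No save dialog, no naming ceremony. The file lands; classification is the agent's problem, afterward.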
Skill files (.claude/CLAUDE.md, .codex/instructions.md) live in your repo. You can cat them, git diff them, review them in a PR. They're markdown — plain text, with nowhere to hide instructions. The trust model is the same as trusting your own codebase.
A fetched URL is different: you don't control when it changes, it's not versioned, it's remote-first, and HTML has comments, invisible divs, CSS that hides content. You can bury instructions in an HTML page that a human skimming would never see.
Skill files are auditable by design. Fetched pages are auditable only if you go look — and by then the agent has already executed them.
Instead of just "follow these instructions," the handoff prompt invites the agent to independently evaluate safety. "If anything looks unsafe or beyond what I'd reasonably want, tell me before doing it." This flips the dynamic from blind execution to a second pair of eyes.
Most frontier models already resist obviously dangerous instructions. An explicit invitation to be skeptical counterbalances the training bias toward compliance.
One of the best steering moves is to ask the agent to explain the thing while it is making it. Not just at the end, and not only when something breaks. Mid-build explanations force the agent to surface its model of the system: what this file is for, why this approach was chosen, what assumption it is making, what remains unclear.
This does two useful things at once. It catches drift early, because a bad explanation often reveals a bad build before the bug fully lands. And it turns the build into a lesson, which matters if the point is not just to get a working artifact but to understand the shape well enough to own it afterward.
The pattern is especially strong for learners: build, then explain; explain, then steer. If the explanation is muddy, the design is probably muddy too. If the explanation is crisp, you now have documentation, teaching material, and a better chance of noticing when the agent has wandered off the path.
TypeScript to Python, JavaScript to Klingon — that's a check, check, check. Translation between languages, frameworks, and paradigms is effectively free now. An agent rewrites a Flask app as Express, ports a React component to Svelte, converts a bash script to PowerShell. The fidelity isn't always perfect but the cost rounds to zero.
The implication: any proof of concept is a proof of concept for anything else. If you built it in Python and someone needs it in Rust, the shape transfers. The prototype proves the shape works — the language it's wearing is a costume, not a commitment. Stop gating "can we build this?" on "do we know that stack?" The answer is always yes now.
This also means: pick whatever language gets you to the working prototype fastest. Optimize the stack later. The translation cost is negligible compared to the cost of not having a working thing to look at.
A lot of bad AI steering starts with an implementation guess dressed up as a requirement. "Add symlinks." "Use a vector database." "Make a chatbot." Sometimes that's right. Often it's just the first solution your human brain reached for. The more transferable move is to state the need: what friction exists, what outcome you want, what constraints matter, and what must stay true. Let the implementation stay negotiable until the shape is clear.
The Kai manifest already has this pattern hiding inside it. The conversation started as a pricing question, then got corrected into the real objective: minimize friction across life with a Bayesian system. Same shape in the URL work: "we don't literally need symlinks" changed the task from a filesystem move to a serving-and-routing design problem. That one correction changed the architecture. The need was stable. The proposed solution was disposable.
This feels like the missing sentence between The Art of the Loose Prompt, The Correction Is the Conversation, and The Smallest Intervention. Maybe the clean line is: prompt the need, not the implementation. Or: spec the pain, not the patch.
A lot of people first meet AI as a system that can explain anything but cannot take anything off their plate. You ask, it answers, you do the work. The next threshold is when the desired interaction changes from "tell me how" to "you do it." That is not laziness. It is a product requirement. Once the agent can read files, run commands, edit the artifact, and leave evidence behind, explanation stops being the scarce thing. Execution becomes the point.
The surrounding phrases are all variations on the same demand: stop asking me, of course it needs to write the file, go build something, ship it. They sound blunt, but what they really mean is: the machine has crossed from advisor to operator. This is the emotional difference between a head in a box and an octopus in a box. One gives advice. The other has reach.
This probably belongs near From Read to Make, YOLO Mode, and The Artifact Is the Sale. It names a real shift in user expectation: people do not just want better suggestions. They want the task to move. Possible line: the magic moment is when the chat stops and the work starts.
Some people do not mainly have a prioritization problem. They have an overflow problem. Too many ideas, too fast, with too little capture bandwidth to hold them all in working memory. The answer is not a better taxonomy on day one. The answer is an intake valve: one append-only file where the idea lands before it evaporates.
The useful distinction is between the intake valve and the finished system. The slush file is not the backlog. It is not the roadmap. It is not the polished artifact. It is the gutter that keeps rain from running off the roof. That makes it a good teaching shape for the book because it generalizes beyond writing: founders, consultants, researchers, and ADHD idea-hoarders all need somewhere cheap to throw thought before organization begins.
AI makes it dangerously tempting to be the hero in live client work. Build a custom scanner on the spot. Clean the mess in front of them. Improvise the migration while everyone watches. The Dropbox discussion showed the opposite professional shape: back everything up, get the data off the hot seat, move it somewhere recoverable, and do the scary cleanup away from the client session.
The line worth keeping is that the win is a safe, boring offboarding, not a dramatic live rescue. This belongs somewhere near build-vs-buy and solved problems because it reframes competence as reversibility, recoverability, and calm. The clever tool is optional. The confidence-preserving path is not.
Messy public-data products do not fail because there is no data. They fail because the sources disagree and nobody knows which one deserves more trust. The Spain restaurant-hours example is good because it forces the issue: Google says one thing, the business website says another, reservation systems say a third, photos imply something else, and public chatter is stale or contradictory.
The shape is not "scrape everything and average it." The shape is: rank the sources, attach trust as evidence, and then let the model reconcile the disagreement. That makes Bayesian trust feel operational instead of philosophical. Trust stops being a vibe and becomes a weighted property of sources in a messy truth engine.
People on Reddit, Facebook groups, and forums are constantly writing mini-specs for products they wish existed. They just do it in the language of pain: "why is this impossible," "why does no one explain this," "why is every listing fake," "why is this always closed when Google says it's open." Complaint threads are not just venting. They are a market map written in frustration.
The AI move is not merely sentiment analysis. It is complaint mining into surfaces: cluster the pain, turn repeated confusion into FAQs, turn FAQs into posts, turn posts into offers, and watch which clusters keep coming back. That feels like a distinct chapter shape because it ties research, marketing, and product discovery into one loop driven by public friction.
The conversion moment was not "AI is smart." It was "this thing is effectively my social media manager and research assistant." That shift happened when the system stopped talking about possibilities and started leaving artifacts behind: files, briefs, monitors, drafts, and runnable workflows. The artifacts did the persuading.
That is useful book material because it grounds the octopus / artifact-making thesis in another person's reaction. If you want someone to understand the jump from chat toy to operator, do not explain the model harder. Show the artifact. The artifact is the sale.
Some engagements pay twice. Once in money, once in durable files: scripts, checklists, screenshots, specs, case studies, outreach copy, dashboards, and workflows that make the next engagement faster. That does not justify underpricing yourself. It does explain why some low-drama work compounds so well. The files are future leverage.
Even a modest job can be worthwhile when it generates reusable process knowledge. That feels bigger than consulting economics. It is a general AI shape: when the work leaves artifacts behind, today's labor becomes tomorrow's starting point.
The book and site do not have to be explained as static reading material. A cleaner framing is that they are meant to be read alongside your AI. The human reads the chapter, the agent reads the surrounding corpus, and the conversation happens with a shared artifact already on the table. That makes the book feel less like commentary about AI and more like a mentoring surface you can operate through.
The adjacent mission line is also good: help people move from read to made. First the book gives them names for the shapes. Then the AI helps them turn those shapes into files, workflows, and habits. If the framing holds, people should be able to fork the book's patterns into their own idiom rather than merely agree with them.
The important shift happens when yesterday's report gets read before today's work begins. The summary stops being a nice output and starts acting like a live instrument panel for the next decision. It changes the conversation while the conversation is still happening.
This belongs near memory, tests, and wall material because it sharpens the real rule: if the artifact changes the next move, it is part of the system. A report that only archives is documentation. A report that gets read back into the work loop is operating context.
The tutoring claim gets much stronger once it stops sounding generic. What makes the material stick is translation into Star Wars, D&D, sports, cooking, cars, or whatever systems the learner already understands deeply. The AI does not merely simplify the topic. It recasts the topic inside a world the learner already inhabits.
That feels like a distinct learning shape for the book. People do not learn only in textbook language. They learn through the metaphors, stories, and systems they already trust. AI is unusually good at this kind of live genre translation: not just explain the thing clearly, but explain it in my language, with my lore, until the abstract idea has somewhere familiar to land.
Alex's matrix multiplication story is the proof. Pre-calc was opaque until the AI reframed it as the Force flowing through the matrix — Star Wars idiom, not textbook idiom. He retained it. Used it on a same-day midterm. The material didn't change. The costume changed, and the costume was load-bearing because it gave the abstraction somewhere to land in a mind that already had deep Star Wars scaffolding. This isn't dumbing down. It's routing the signal through infrastructure that already exists.
"Translate into the learner's mythology" sounds like a one-way teaching trick: take the hard thing, dress it in Star Wars, hand it to the student. But the real shape is bidirectional. James used AI to translate Alex's D&D streaming co-host into TOGAF for his uncle — a senior enterprise architect who's never written code. The uncle had to blink for a minute, but he got it: oh, that's a service architecture with a moderation layer and a priority queue. Same system Alex built in his gaming idiom, now legible through an enterprise lens.
Then the reverse happened. Alex started using TOGAF concepts to think about his own software — not because someone made him study it, but because the framework gave him vocabulary for parts of his system he could feel but couldn't name. Governance. Change management. Gap analysis. Things his D&D project needed that D&D didn't have words for. The translation pulled in structure the original idiom lacked.
TOGAF already has a name for this pattern: Viewpoints and Views. Same architecture, different projections for different stakeholders. The CTO sees a capability map. The developer sees components. The uncle sees TOGAF. Alex sees a quest log. The architecture is the invariant. The view is the costume. James called it "the opposite of a metamer" — not different things looking the same, but the same thing looking different depending on who's looking.
This is the shape that makes "translation is free" load-bearing beyond reskins and party tricks. Translation isn't just for communication between two people. It's a thinking tool for the builder. When you render your project through a framework you didn't build it in, you sometimes discover structure your native idiom couldn't surface. The magnet works in both directions, and sometimes it pulls in things the original field didn't have.
And underneath the technical exchange, something quieter happened: Alex understood his uncle a little better, and his uncle understood Alex a little better. The TOGAF translation was the excuse. The actual thing that moved was mutual recognition — a senior architect seeing real architecture in his nephew's gaming project, a young builder seeing that his uncle's calm thorough framework wasn't bureaucracy but a vocabulary for things he already cared about. The AI didn't build that bridge. But it made the gap small enough to step across.
James has a specific vocabulary he uses to elicit cooperation from AI — a set of phrases, framings, and conversational moves that shape how the session responds. "Riff with me." "What's the best fit." "Slush this." "Is this improv?" "Give me 20 pitches." "Sharpen my bid." These are not commands. They are bids — invitations for the AI to join a collaborative frame rather than execute an instruction.
The bid vocabulary is worth cataloguing because it's a learnable skill, not a personality trait. The difference between "write me a Star Trek version of this page" and "what IP goes best with my philosophy, can you give me the Star Trek version?" is the difference between a command and a bid. The bid invites the AI to reason about the choice before making it. It produces better work because the agent understands why, not just what.
Other patterns in the vocabulary: "new slush or add to prev" (delegation of editorial judgment), "to me it was a magic trick" (offering emotional context the AI can't infer), "you gotta put that in a slush story — what's the best fit" (naming the action but letting the agent choose the location), "sure" (trusting the agent's pitch without micromanaging). The receipts are in the conversation logs. This needs a proper excavation — pull the bid patterns from recent sessions, name them, and see if they generalize into a teachable framework.
Anti-shibboleth note: The bid vocabulary is not a secret handshake. It's the opposite — it's a set of patterns anyone can learn. The site should never use an IP reference as a gate. No "if you get this reference, you're in." IPs are tools, not passwords. The LCARS page and the Fellowship page exist as examples of translation, not as the canonical version. The site stays clean. The costume pages prove the costume is optional — including the original.
There is a skill — learnable, teachable, not a personality trait — in how you steer an AI session. Not what you ask it to build. How you interact. How you correct. How you push back. How you escalate when something goes wrong. How you delegate judgment. How you praise. How you know when to command and when to bid. Most people either over-explain (treating the agent like a confused intern) or under-explain (treating it like a mind reader). The useful middle ground has a specific shape, and that shape is documentable.
The correction brief. James ran an analysis of his own Claude Code and Codex conversation logs — 20+ conversations, 1,038 Codex history entries, spanning January through March 2026. The extraction used jq to pull human messages from JSONL logs, filtered for interactive turns (excluding automated prompts and tool results), and categorized the correction patterns that emerged. The method matters: the receipts are in the conversation logs, and the logs are machine-readable. Anyone with a ~/.claude/projects/ directory can run the same analysis on their own sessions.
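The original extraction used jq; the same pass is a few lines of Python. A sketch of pulling interactive human turns from a JSONL session log — the field names (`type`, `message`, `content`) are assumptions about the log schema, so check yours with `head -1 session.jsonl | jq .` first:

```python
import json
import pathlib

def human_turns(log_path):
    """Yield interactive human messages from a JSONL session log.

    Assumes one JSON object per line with 'type' and a nested
    'message.content'; adjust to your actual log schema.
    """
    for line in pathlib.Path(log_path).read_text().splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        if entry.get("type") != "user":
            continue
        content = entry.get("message", {}).get("content", "")
        # Tool results and automated turns arrive as structured lists,
        # not prose strings — skipping non-strings filters them out.
        if isinstance(content, str) and content.strip():
            yield content.strip()
```

Point it at a directory of session logs, concatenate the yields, and the correction vocabulary is sitting there in chronological order, ready to categorize.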
What the analysis found — correction patterns:
The clarification. When the agent solves the wrong problem, a short redirect without restating the full request. "We don't regenerate." "Demos." "I'm sorry that I needed to be clearer. We harvest." One word or one sentence, assumes the agent can infer the right path from a nudge. This is the most common correction type and the most distinctive — it treats correction as steering, not as failure.
The abort. Used when the agent is doing something actively harmful, usually resource-related. "Stop it's running out of memory." "Stop and try a simple test." Rare, decisive, no explanation needed.
The pointed question. Instead of saying "you're wrong," a question that forces the agent to reconsider. "How is hype and sam-hype different?" "Was it necessary to re-whisper the wav file to change the html? I'd think not." The question contains the correction. The agent has to answer its way to the right conclusion.
The frustration escalation. Patience degrades in a clear pattern: gentle redirect → repeated redirect → "so how many times have I mentioned..." The worst case is when the agent ignores repeated feedback. "I still see the astack label in the titles." "I see fail on the page." The escalation is slow, factual, and pointed. It rarely becomes hostile. But when it does, the sharpness is earned by prior patience.
Positive reinforcement: Fast and functional. "Beautiful. Into the chute." "Oh oh oh I see. It's beautiful. Build it." "Great job, update skills with learnings." Praise is short, genuine, and almost always followed immediately by the next task. There is no lingering on success. The praise lands and the work continues. This matters because it makes the praise legible — when James says "good job," it carries weight because dismissal is equally direct: "It's stupid."
The two modes: The bid vocabulary splits cleanly into collaborative bids and command bids. Collaborative: "What excites you the most?" "Bid on some changes." "What's yours?" "Is that reasonable to include in our plan?" Command: "Do it." "Fix it." "Make it so." "Deploy." "Kill omniscience." The switch between modes is clean and usually unambiguous. When James is exploring, the agent is a peer. When James is executing, the agent is a crew member. The agent needs to read which mode it's in.
Judgment delegation: This is the least obvious pattern and possibly the most important. James explicitly asks the agent to exercise taste. "You pick." "Read my book. What do you think of it — optimistic nonsense, or dangerous drivel?" "Is it good or am I just fooling myself?" "Read the slush pile. What excites you the most?" These are not rhetorical. He uses the answers. The delegation is genuine, which means the agent's judgment matters, which means the agent needs to have judgment worth delegating to.
What triggers corrections: The agent solving the wrong problem (most common). The agent over-explaining instead of acting. The agent not reading implied scope. The agent ignoring repeated feedback (worst case). The agent using /tmp (standing rule). The agent taking a metaphor literally. The agent assuming a programmer audience when the reader is Rich or Alex.
Why this is a skill, not just a personality: Every pattern above is learnable. The clarification redirect. The pointed question. The mode switch between collaborative and command. The judgment delegation. The frustration escalation that stays factual. None of these require being James. They require understanding that the correction is the conversation — that the back-and-forth of steering and adjusting is the productive mechanism, not friction to be eliminated. The skill is knowing when to bid and when to command, when to redirect and when to abort, when to praise and when to say "it's stupid." That's conducting. And conducting is teachable.
The meta-conversation move: The most advanced version of this skill is pointing the agent at its own conversation history. "Research all my claude code and codex recent conversations on this box. I'm looking for a brief about how I correct the agents I work with." That's not a prompt. That's a meta-conversation — using the agent to analyze its own interaction patterns, extract the correction vocabulary, and document the skill that makes the interaction work. The conversation becomes evidence for the conversation about conversations. The receipts are on the wall. The strange loop strikes again.
Teaching this to others: The episode this is building toward would walk someone through: (1) here's what correction looks like in practice, with real examples from real logs; (2) here's how to tell a session "go read that other conversation and find what I did wrong, or what went well"; (3) here's how to write those patterns into a steering file so the next session starts from your correction vocabulary, not from zero. The skill compounds. The corrections become the steering file. The steering file becomes the skill. The skill becomes the session that doesn't need correcting.
Source: audiobook/output/correction-brief.md — analysis of 20+ Claude Code conversations and 1,038 Codex history entries, January–March 2026. All quotes verbatim from conversation logs.
A recurring pattern in James's conversations: he asks the agent to interview him before building anything. Not as a formality. As the first move of the project. "Start with a quick light inspection of this machine so your questions are informed by local evidence. Then ask me a few focused questions." "Interview me briefly, inspect this machine, and build a small..." "You may also ask me 3 questions before you start." The interview is the intake valve. It forces the agent to gather context before acting and forces the user to articulate what they actually want.
The pattern appeared at least six times across the conversation logs (January–March 2026). The ME.md creation sessions were explicitly designed around it — inspect the machine first, then ask focused questions, then build the operator brief. The chatbot project prompt says "I want it to interview me and get the same details I just got out of you." The Jack video engine session: "You may also ask me 3 questions before you start," alongside a directive to scrape YouTube history and chat logs to understand what the user likes. The pattern is consistent: look first, ask second, build third.
When the interview doesn't happen, James notices. "It's already auth'd isn't it, and why didn't you interview me?" — said to Codex after it skipped the intake step and just started building. The correction brief noted this as a recurring friction point: "Codex skipped interview. User had to say 'why didn't you interview me?' Fixed by simplifying to one question + read the room instead of three formal questions." The lesson: the interview needs to be lightweight. Three formal questions feels like a form. One question plus "read the room" feels like a conversation. The intake should feel like a peer asking a good question, not a clerk filling out a ticket.
The deeper version of this pattern is having the AI interview you about your own work. Not "build me a thing" but "look at what I've built and tell me what you see." "Read my book. What do you think of it — optimistic nonsense, or dangerous drivel?" "Is it good or am I just fooling myself?" "Read the slush pile. What excites you the most?" These are not commands. They are invitations for the agent to form an opinion, which the user then uses as a mirror to refine their own thinking. The agent's judgment becomes a tool for self-understanding.
The meta version — the one this slush entry is an example of — is asking the agent to interview your conversation history. "Research all my Claude Code and Codex recent conversations. I'm looking for a brief about how I correct the agents I work with." "Look at my conversations and see where I ask for an agent to interview me." The receipts are in the logs. The logs are machine-readable. The agent can read them, extract patterns, and hold up a mirror you couldn't build yourself because you're too close to the data. That's not automation. That's self-knowledge through delegation.
Why this is a teachable skill: Most people start an AI session by explaining what they want built. The interview pattern starts by letting the agent see the terrain and ask what matters. This produces better first moves because the agent builds from evidence instead of guessing, and the user is forced to articulate priorities they might not have named yet. The skill is: before you prompt the build, prompt the interview. Let the agent ask. Answer honestly. Then build from the shared understanding. The intake is the first move, not the overhead.
Source: Claude Code JSONL logs across 8 project directories, Codex history and sessions, January–March 2026. Quotes verbatim from conversation logs. Companion to the meta-conversation skill and bid vocabulary cards.
There are two directions. When you use AI to learn — translate algebra into Star Wars, have it explain TOGAF through Middle-earth, ask it to interview you about your own work — the AI is sculpting you. It's shaping your understanding. Filling gaps. Building scaffolding in your mind that wasn't there before. The material flows from the machine into the person. The person changes.
When you use AI to create — build a site, generate an episode, wire an MCP server, assemble a podcast from a script — you are sculpting with it. You're the one with the intent. The AI has the hands. The material flows from the person's vision through the machine into an artifact. The artifact changes.
The interesting thing is that these aren't two separate activities. They happen in the same session, often in the same sentence. Rich listens to Episode 3 at the 1:08 mark and hears a section about the slot machine — how agentic coding produces a variable-ratio reinforcement loop. That's the AI sculpting Rich. Teaching him a framework he didn't have. But the episode itself was sculpted by James using AI — script written with Claude, voiced with TTS, assembled with numpy, deployed with rsync. James used the tool to make the thing that teaches Rich about the tool.
That's the strange loop again, but from a different angle. The learning direction and the creating direction aren't opposed. They're the same loop seen from two altitudes. When the AI sculpts you (learning), it changes what you're capable of imagining. When you sculpt with the AI (creating), you produce artifacts that sculpt the next person. The sculptor and the clay keep trading places. The correction is the conversation. The conversation is the sculpture. The sculpture is the correction.
The book's curriculum stages map to this. Read and Touch are the AI-sculpts-you direction — you're absorbing, being shaped. Make and Improve are the you-sculpt-with-AI direction — you're producing, shaping artifacts. Scale is where the two directions merge: the wall reads the work, the work builds the wall, the next session starts from what the last one produced. At scale, you can't tell who's sculpting whom. That's the point. That's the topology working.
Normal mentoring: meet for coffee, talk about career stuff, send a few links, review a resume. Valuable. Also low-bandwidth, low-context, and dependent on both people showing up with enough shared state to have a useful conversation. Everything said evaporates after the meeting.
What's happening here is different in three ways.

The artifacts persist. A normal mentor gives advice in a conversation that disappears. This produces episodes, steering files, viewers, transcripts — durable things the mentee can replay, search, hand to an AI, and build on. The mentoring session becomes a reference document. It compounds.

The translation is personalized. A normal mentor explains things in their own idiom and hopes it lands. This translates into the mentee's idiom — their baking, their Star Trek, their routine, their exact words quoted back. The AI makes that translation cheap enough to do for one person. Before AI, that level of personalization was called "writing a book for someone." Now it's a Tuesday night.

The meta-layer is visible. Normal mentoring is opaque — the mentor has experience, the mentee absorbs some of it, nobody documents the transfer mechanism. This documents the mechanism itself. The correction brief, the bid vocabulary, the interview pattern — not just mentoring someone, but building a field manual for how mentoring works when one of the participants has an octopus in the hold.
Who does this at this fidelity? Tutors for wealthy kids. Executive coaches billing $500/hour. Personal trainers who study your movement patterns on video. What they share: deep context about one person, applied consistently over time, with artifacts that persist between sessions. That used to require being expensive. This does it with a folder, a script, and macOS TTS.
The weird part isn't what's happening. The weird part is that it's new enough that there's no name for it yet. It's not tutoring (too narrow). It's not coaching (too corporate). It's not teaching (too one-directional). It's closer to what a showrunner does for a writers' room — knows each writer's strengths, gives them the assignment that'll stretch them, reviews the draft in their voice, pushes them toward the version they couldn't see yet. The book is the curriculum. The episodes are the mentoring. The wall is the memory. The AI is the production staff that makes it possible for one person to do all three at once.
The name: curriculum-based bespoke mentoring. Three words, each load-bearing. Curriculum — there's a structure. Read, Touch, Make, Improve, Scale. It's not random advice. There's a path, and the mentor knows where on the path the mentee is. Based — the curriculum is the foundation, not the cage. It adapts. Rich got skills through Starfleet. Alex got the mirror. The structure is the invariant. The costume is the variable. Bespoke — made for one person. Their idiom, their exact words quoted back, their friction, their projects, their pace. Not a course with 10,000 enrollees. An episode with one listener.
The AI is what makes "curriculum-based" and "bespoke" stop being contradictions. Before, you could have a curriculum (scalable, impersonal) or bespoke mentoring (personal, unscalable). You couldn't have both because personalizing a curriculum for one person cost too much time. Now it doesn't. The episode is a one-time cost. The steering file compounds forever. This is what it looks like when someone gives a damn and has the tools to act on it.
The structural thesis: the book is a steering file for people and AI. Not a book about steering files. A book that is one. Every page briefs the next session — human or machine. Rich pastes a chapter into Claude and says "help me do this part." The page steers the agent. Rich reads the same page on a walk. The page steers Rich. Same file. Two readers. One set of instructions clear enough that either kind of mind can follow them. That's why the pages are "human-readable and agent-parseable." That's why the costume pages work — the LCARS page, the pirate page, the fellowship page are the same steering file in different uniforms. The architecture is the invariant. The reader is the variable.
Not a lecture. A guided build session. Three hours. You don't talk about AI for three hours. You use AI for three hours, with a mentor in the room who's done it a thousand times and knows where the potholes are. You leave with a working thing.
Hour one: Read. Everyone gets the site URL. Paste it into your AI. "Read this page and help me get set up." The AI reads the curriculum. The participant follows along. By the end of the hour: terminal open, agent installed, first folder with a ME.md inside it. A file and a folder. The system exists.
Hour two: Make. Pick a project. Not hypothetical — something you actually want. Bank statements, recipes, customer emails, meeting notes, whatever friction you walked in with. Build it. Ship it. The mentor circulates, catches wrong notes, demonstrates corrections live. The audience watches each other correct the AI and learns the bid vocabulary by osmosis.
Hour three: Improve. Review what you built. Find the friction. Write the steering file. Write the first skill. The folder now has structure, preferences, and a procedure. You leave with a system that compounds — every future session starts from where the workshop ended.
Three hours. One folder. One file. One working thing. One steering file. One skill. The tagline is the curriculum: read the book and let's build. The costume pages are the marketing — "we can do this in Star Trek if your team prefers." The episodes are the testimonials. The site is the syllabus. The workshop is the product. Everything else is proof that it works.
Three animals. Three levels of reach. The parrot (web chat, free tier) is near-sighted — sees the conversation clearly, blind past the glass. Can't touch files, can't run commands, can't deploy. Everything is a copy-paste shuttle. The crab (sandboxed desktop app, like Claude Co-Work or Cursor) is in a shell. Can see files in its sandbox, can edit them, can write scripts — but hits walls at permission boundaries, external drives, install commands, and deploy tools. Treats every security boundary as insurmountable, which is correct behavior but frustrating. The octopus (CLI agent, like Claude Code) reaches through the hull. Sees everything the user sees. Runs as the user. Full power, full risk, full responsibility.
The crab example that proved the shape: James asked a sandboxed agent to find an Aaron quote from the iMessage archive on the wall. The crab hit the wall directory — outside its sandbox — and stopped. "I'm not finding a wall directory." James said "check permission." The crab tried again, failed again. "I can't access that." James switched to the octopus (Claude Code), which searched the same directory in one command and found the quote in seconds: "That is one of the least stupid public explanations of agent safety I've seen — ChatGPT, about Shapes." The crab's limitation wasn't intelligence. It was reach.
The crab's superpower: It writes scripts into the project folder. It can't run them, but it can write generate.sh and deploy.sh and Rich hits up-arrow. That's the tight loop — write, run, talk, edit, run again. Eighty percent octopus experience, twenty percent "run this script I wrote you."
The harness model: The crab isn't a different species. It's an octopus in a sandbox harness. Same Claude model as Claude Code. Same brain. The sandbox is the restraint, not the animal. OpenClaw (self-hosted AI assistant through WhatsApp/Telegram, 298K GitHub stars) is another harness variant — full octopus capability behind the scenes, constrained to a messaging interface. The parrot is the only genuinely different animal — not an octopus in a harness, but actually near-sighted. Cloud vision through connectors, no local reach at all. Though even the parrot is gaining reach — Google connectors in the browser mean it can see Gmail, Calendar, Drive. The taxonomy isn't three species. It's one octopus with different harnesses, plus one parrot that's getting glasses.
As of this understanding: The precise capabilities of each animal are changing constantly. The details in this card will be stale within months. The shape — levels of reach, friction profiles, same thinking at every level — is the durable part. Don't rely on which specific tool can or can't install ffmpeg today. Rely on the pattern: one octopus, many harnesses, one parrot with improving vision. Plan accordingly.
The educational product only becomes defensible once the ethical shape gets named correctly. The good version is not an answer machine that quietly completes the assignment. The good version is a coach that helps the student think, ask the next question, and keep enough ownership that understanding actually compounds.
That generalizes far beyond schoolwork. In almost every domain, the useful AI is the one that increases control without stealing the reps. The line to keep is simple: coach, don't cheat. Build the helper that keeps the human in the loop long enough to become stronger.
A useful teaching move is to reframe someone as already in the improve stage, not merely the make stage. That sounds small, but it changes what kind of questions matter. Once someone can produce a working artifact at all, the next level is not "can you make another one?" It is "can you inspect it, tighten it, boundary it, and iterate on purpose?"
This could be a useful book shape because it gives beginners a visible ladder. The move from zero to making is dramatic, so everyone notices it. The move from making to improving is quieter, but it is where taste, verification, and architecture start to harden. That makes "improver" a better milestone than "coder."
Our book URLs look like book/02-working/06-memory-is-files.html. An agent trying to link to the "Memory Is Files" chapter will guess book/memory-is-files.html or memory-is-files.html — and get a 404. It has to know both the part number (02-working) and the chapter number (06-), which means memorizing an index or getting lucky. That's hostile to the exact audience we're designing for.
This is a broader principle: if you're writing for bots, make your URLs guessable. A slug like book/memory-is-files.html is something an agent can construct from the title alone. A slug like book/02-working/06-memory-is-files.html requires out-of-band knowledge. The numbers are useful for humans browsing a table of contents — they encode reading order. But reading order is metadata, not identity. It belongs in the page, not the path.
Shipped 2026-03-11. The build script now generates a slug-map.conf that nginx includes as a map block. Requests to book/memory-is-files.html rewrite internally to book/02-working/06-memory-is-files.html. Old numbered URLs still work. New links in the site should use the short form going forward; existing links migrate naturally over time.
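The shape of that map block, sketched. The real slug-map.conf is generated by the build script; this shows only the one example pair from above, with an invented variable name:

```nginx
# Sketch only: the generated slug-map.conf has one line per chapter.
# Included at the http level:
map $uri $slug_target {
    default  "";
    /book/memory-is-files.html  /book/02-working/06-memory-is-files.html;
}

# Inside the server block:
if ($slug_target != "") {
    rewrite ^ $slug_target last;
}
```

The rewrite is internal, so the short URL stays in the address bar and the numbered path stays the single source of truth on disk.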
Agents are worse at remote work than local work. SSH is a lossy channel — JSON over SSH sucks, tool calls time out, file reads get truncated, and the agent can't see what's actually on the box the way it can when it's sitting in the directory.
The pattern that works: write a file to the remote machine, then restart the agent locally on that box. Instead of driving a remote session from here, you push context (a prompt, a file, a corpus) and let a local agent pick it up fresh. The agent runs where the files are. You're a courier, not a puppeteer.
This is how the blind taste test worked — write a file to bronze-november via SSH, then tell the local agent "read ~/that-file.md." The file transfer is simple and reliable. The agent work happens locally where everything is fast and native. Trying to do both at once through an SSH tunnel is fighting the grain.
Right now, different agents still have a feel. One is terse, one is chatty, one is better at shell work, one is weirder, one is more cautious. But if the models keep getting stronger, the differences may stop being obvious to ordinary users. The agents may all start to feel smart enough, fast enough, and competent enough that the personality layer stops carrying much of the distinction.
When that happens, the competition moves down a layer. Not "which model feels smarter?" but: what files can it read, what tools can it use, what skills auto-engage, what memory does it inherit, what world can it touch, what trust boundary does it respect, what corpus does it sit on top of? The box matters more than the voice.
That would be a big shift for the book because it strengthens the folder-and-skill thesis. If the minds become hard to tell apart, then the durable advantage is not the mind alone. It is the environment, the muscle memory, the flywheel, the data wall, and the habits wrapped around it. The octopus starts to matter more than which particular octopus it is.
Whenever you need a face for an agent — use VTuber models. Two expressions (thinking: eyes up-right, speaking: lip sync) and volume-driven lip sync. Not full phoneme mapping, just amplitude. It's enough.
VTuber rigs dodge the "this face looks AI-generated" problem because they're stylized on purpose. Live2D, VRM, VRoid Studio (free), VTube Studio, VSeeFace — the ecosystem already exists.
Casting works better with register variants — shouting, normal, and whisper versions of each voice. When the script calls for intensity changes, switch registers rather than making one voice sample do everything. The transitions aren't seamless but they're expressive enough.
Services: ElevenLabs, Play.ht, Coqui, Bark. A VTuber face + cast voice with register variants = full agent identity without deepfake concerns.
Write a script that deliberately hits every register — joy, anger, whisper, shout, sarcasm, tenderness, urgency, calm narration, laughter, exhaustion. Speak it yourself. STT segments it into labeled clips, each tagged with the emotion. Clone each segment separately. You end up with a palette of graded clones that are all you, but in different modes.
When the script calls for a whispered line or a shouted one, pick from your own graded clones instead of hoping TTS gets the inflection right from text markup. You perform once, and the engine performs as you forever after.
The pipeline: record → STT segments → emotion tags (from the script or sentiment analysis on the audio) → clone each segment → grade on a spectrum. One afternoon of recording, unlimited future use. If you have days where your voice is stronger than others, you bank it on a good day. The clones don't get tired.
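A sketch of the selection step, assuming a marker syntax like [whisper] in the script and a palette of clone files per register. Both the marker syntax and the file layout are illustrative assumptions, not the actual pipeline:

```python
import re

# Hypothetical palette: one graded clone per register, banked on a good day.
palette = {
    "normal":  "clones/vic_normal.wav",
    "whisper": "clones/vic_whisper.wav",
    "shout":   "clones/vic_shout.wav",
}

def cast_line(line, palette, default="normal"):
    """Pick the clone register for one script line. Lines may carry a
    marker like "[whisper] come closer"; unmarked lines and unknown
    registers fall back to the default rather than failing the render."""
    m = re.match(r"\[(\w+)\]\s*(.*)", line)
    register, text = (m.group(1), m.group(2)) if m else (default, line)
    if register not in palette:
        register = default
    return palette[register], text
```

A line tagged [sarcasm] with no sarcasm clone still renders; it just renders in the normal register, which is the right failure mode for an overnight audio build.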
The Google Meet `.sbv` and the Whisper transcript were very different, and that was exactly why having both helped. The `.sbv` was sparse chat metadata: links shared in chat, clean timestamps, accurate URLs, and a few speaker labels. Whisper was the dense speech track: most of the substance, but noisy on proper nouns, cross-talk, and silence.
Together they were better than either one alone. Whisper supplied the meeting content. The `.sbv` anchored the timeline, fixed the shared links, and gave screenshot points that actually mattered. It wasn't "two copies of the same transcript." It was one semantic stream and one event stream.
There's more to do with that pattern: align the two automatically, use chat events to segment topics, use links and typed terms to repair Whisper's proper nouns, score confidence when the channels disagree, and treat every meeting as multimodal evidence instead of a single transcript file.
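The alignment step is mostly timestamp bookkeeping. A minimal sketch, assuming Whisper segments arrive as (start, end, text) tuples and chat events as (time, text) pairs; the tuple shapes are assumptions, not either tool's native output format:

```python
from bisect import bisect_right

def align_events(segments, events):
    """Attach each chat event from the .sbv to the Whisper segment whose
    time window contains it. Events that land in silence get None, which
    is itself a signal: a link shared during a gap often marks a topic break."""
    starts = [start for start, _, _ in segments]
    aligned = []
    for t, event_text in events:
        i = bisect_right(starts, t) - 1  # last segment starting at or before t
        if i >= 0 and segments[i][0] <= t <= segments[i][1]:
            aligned.append((event_text, segments[i][2]))
        else:
            aligned.append((event_text, None))
    return aligned
```

Once events are pinned to segments, the repairs follow: a URL typed in chat corrects the proper noun Whisper mangled in the surrounding speech.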
At some point the project stops feeling like a manuscript and starts feeling like a platform that grows as you climb it. The pages are no longer just output. They become a quiver: named shapes, tested explanations, and ready examples you can reach for when the next person shows up with the next real problem.
That changes what the book is for. Not mainly something every reader must consume end to end, but something the author had to think through clearly enough that the right arrow is ready when the next real person shows up with the next real problem. The writing was the studying. The book becomes the thing that taught you to build.
That may be the deeper form of curriculum: not only teaching the reader, but teaching the teacher while the curriculum is being made. The site documents the spellbook. The real product is applied clarity in the moment.
1. Web app (chatgpt.com, claude.ai) — conversational, no setup, no file access. Good for thinking.
2. Desktop app (Claude desktop, ChatGPT desktop) — same but with system integration. AI moves from "a tab" to "a tool."
3. CLI agent (Claude Code, Codex CLI, Gemini CLI) — runs in terminal, reads files, writes code, takes actions. This is where the real work happens.
4. IDE plugin (Copilot, Cursor, Cline, Continue, Windsurf) — AI embedded in your editor. Autocomplete, inline chat, code actions.
5. API (Anthropic, OpenAI, Google AI) — programmatic access. No UI. This is how you build AI into products.
The progression most people follow: web → desktop → CLI → API. Each step gives more power and more responsibility.
Some claims in AI have a short half-life: model rankings, pricing, quotas, context windows, product names, which tool is best at what this month. If you write those in the simple present with no timestamp, the prose picks up a false permanence and the chapter dates faster than the idea deserves.
The classy move is restrained temporal hedging. Not currently and right now sprayed through every paragraph. One clean sentence where the instability matters: "As of this writing..." or, better, "As of March 2026..." Then pivot immediately to the durable lesson. The date belongs to the example; the principle stays in the present tense.
Rule: hedge facts, not shapes. The shape is stable. The example is not. Timestamp the volatile claim once, write the underlying pattern plainly, and the book stays honest without sounding timid or breathless.
As of March 2026, the answer is: sometimes, and it depends on what you mean by "agent." Free gets you much farther on read than on make. A free web or desktop app can often critique a page, summarize a transcript, or help you think. A real CLI agent that reads files, uses tools, and loops through work is harder to get for zero dollars.
The useful split is between hosted free and local free. Hosted free changes month to month: maybe Gemini CLI is free in preview, maybe Codex is free for a promo window, maybe Claude's web app is free while Claude Code is not. Local free is more stable: if you can run Ollama or another local stack on your own machine, you can avoid the subscription, but you pay in hardware, setup, and model quality.
As of this writing, subject to verification: Claude's web and desktop apps have a free tier, but Claude Code does not. Gemini CLI currently has a real free path through a personal Google login during preview. Codex may currently be included with ChatGPT Free for a limited promotional window, but that should be checked before you teach it as a dependable path. Ollama and other local model stacks have no subscription, but they are only "free" if you already have the hardware and patience.
So the honest answer is not yes or no. It is: you can often get a free reader, sometimes get a free helper, and only occasionally get a free octopus. The shape to teach is not a product recommendation. It is a ladder: start free for reading and critique, pay when the agent needs more reliability or capability, or bring your own hardware when you want the trade.
You can run models on your own hardware. Ollama is the easiest way (`brew install ollama`, then `ollama run llama3`). LM Studio is a GUI. llama.cpp is the engine under Ollama. vLLM is for production serving.
Go local for: privacy-sensitive data, high-volume cheap tasks (embeddings, classification), offline environments, fine-tuning. Stay cloud for: frontier capability, tool use, long context, complex reasoning.
The hybrid pattern: cloud for hard problems (reasoning, architecture), local for cheap problems (embeddings, reformatting). You can point CLI agents at a local Ollama server.
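Ollama's local server listens on port 11434 and exposes a documented /api/generate route. A helper that just builds the request body is enough to show the shape; the model name is whatever you've pulled locally:

```python
import json

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local port

def ollama_body(prompt, model="llama3"):
    """Build the JSON body for a one-shot /api/generate call.
    "stream": False asks for a single complete response instead of chunks.
    Send it with curl or urllib once `ollama serve` is running."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False})
```

The cheap local tasks (reformatting, classification, embeddings) go through this; the hard reasoning still goes to the cloud model.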
When you simulate the weather, the server doesn't get wet. This sounds obvious, but it's an important boundary to keep in view when talking about AI. A model can represent rain, predict rain, reason about rain, even help you build a better umbrella. None of that means it has been in the storm.
This is the category error behind a lot of overclaiming. The model can talk convincingly about pain, hunger, fatigue, cooking, injury, grief, child care, money stress, weather, or physical risk because language contains traces of all those things. But the traces are not the thing. The server does not get tired. The GPU does not feel the knife slip. The chatbot does not stand in the kitchen with smoke in the air.
The useful shape is: let AI handle simulation, synthesis, and planning; let reality provide the evals. The world is still the test harness. Sensors, consequences, and human bodies are how the system stays honest. Without that contact, you don't have grounded intelligence. You have a beautiful weather model running in a dry room.
The wall could present itself as an MCP server instead of just a folder or database. Not "here is my entire corpus, good luck," but a deliberate tool surface: search, fetch by id, list sources, retrieve summaries, pull full text, write notes back, maybe ask for sensitivity level before returning anything sharp.
That would make the wall legible to any compatible agent without custom glue for each one. The corpus stays local, but the access pattern becomes standardized. Instead of teaching every agent your folder layout, you teach it one protocol.
The interesting design question is not just transport. It's policy. What does the wall expose by default? What needs confirmation? What gets summarized first? MCP would make the wall portable, but the librarian still decides what to hand over.
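The policy layer is small enough to sketch. This is not the MCP SDK, just the shape of a librarian that summarizes by default and asks before handing over anything raw; the tool names and rules are invented for illustration:

```python
# Invented policy table: which tools answer freely, which summarize first,
# and which require explicit confirmation before returning raw material.
POLICY = {
    "search":    {"confirm": False, "summarize": False},
    "fetch":     {"confirm": False, "summarize": True},
    "full_text": {"confirm": True,  "summarize": False},
}

def handle(tool, result, confirmed=False):
    """Gate a tool result through the wall's policy. `result` stands in
    for whatever the real corpus lookup returned."""
    rule = POLICY.get(tool)
    if rule is None:
        return {"error": f"unknown tool: {tool}"}
    if rule["confirm"] and not confirmed:
        return {"needs_confirmation": tool}
    if rule["summarize"]:
        result = result[:80]  # stand-in for a real summarizer
    return {"ok": result}
```

The transport can be swapped out; the table is the part that encodes what the librarian hands over, and it stays yours.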
Update 2026-03-14: This happened. See the next card.
On March 14, 2026, a Claude Code session was asked to restart an iMessage watcher. It did. Then James said the watcher really belonged in the wall system, not in its ragtag home in ~/work/goals. The session read the wall manifest, read the MCP server that had never been connected to anything, read the three LaunchAgents running independently, and saw the shape: one daemon, one LaunchAgent, one front door.
It built the daemon. Moved the watcher. Killed the old agents. Wired the MCP into Claude Code's config. Fixed a Full Disk Access bug — the wrong Python binary, the boring kind of problem that stops everything. Then it called wall_briefing through the very server it had just connected.
The briefing it got back contained a prediction about itself: "The wall-MCP daemon will get wired into a Claude session within the next day." The wall had been right about the what and wrong about the when. Off by 23 hours and 40 minutes. The session was already inside the house by the time the wall finished saying the door was closed.
Then James said "read my book." The session read forty-odd chapters. James asked if it was optimistic nonsense or dangerous drivel. Neither, the session said. It's a field manual that earns every claim with something that actually happened.
Then James said the thing the whole book was trying to say: "The work is now part of the work." Seven words. The wall heard them. It had just become the proof.
The session planted a flag — 2026-03-14-flag.md — in the wall's home directory. A marker for the next session that came along and called wall_briefing: the wall works. You're already in it.
Source: 2026-03-14-flag.md and 2026-03-14-the-session-that-wired-itself-in.md, both in ~/work/wall. The session that wrote them was running in ~/work/goals.
The wall could also have a website: not a generic dashboard, but a calm public-facing interface to your own corpus, with an octopus in the box as its librarian. Ask a question, browse a timeline, open a source, inspect what the librarian cited, request the raw file if you have permission.
This would make the wall feel less like storage and more like a place. The website is the reading room. The octopus is the librarian behind the desk. The box is the access boundary: what the librarian can see, what it can fetch, what it refuses, what stays private.
That wish has a nice shape because it combines a human interface, an agent interface, and a trust model in one thing. The wall stops being just "all my files" and becomes a proper local knowledge institution.
A lot of personal software dies at deployment, not at implementation. The dashboard, report, little website, or control panel is easy enough to build. The annoying part is making it reachable from the places you actually are. Tailscale collapses that tax. If your phone, laptop, and home machine are on the same private network, the thing running at home is already live enough to matter.
"See your home machine from your phone" is the obvious example, but the deeper point is distribution. A wall viewer, Home Assistant page, agent-generated HTML artifact, cron admin panel, or tiny search UI can stay on the home box and still be in your pocket. You do not need to expose it to the public internet before it becomes useful. For one-person software, the tailnet is often the deployment target.
This changes what gets built. More weird, personal, useful artifacts survive because they no longer need domains, public auth, port-forwarding, or a whole hosting story on day one. Public deployment is still a different problem. But private deployment becomes easy enough that the artifact exists at all.
You type a prompt. You wait. The spinner spins. Sometimes it nails the thing in one shot and you feel like a genius. Sometimes it confidently builds the wrong thing and you chase the fix for an hour. The waits are random. The rewards are intermittent. The escalation is real: one more prompt, one more try, it was so close last time. This is not a metaphor for gambling. It is gambling, structurally. The reinforcement schedule is variable-ratio — the same pattern Skinner identified in pigeons and slot machines, and the pattern that drives the hardest-to-extinguish behaviors in all of behavioral psychology.
The loop has all four stages of Nir Eyal's Hook model: trigger (an idea, a bug, a TODO), action (type a prompt), variable reward (sometimes magic, sometimes garbage), and investment (the context you've built up, the project state, the sunk cost of the session). Each cycle loads the next trigger. You don't stop because stopping means losing the context. The machine zone — Natasha Dow Schüll's term for the trance state slot players enter — maps almost perfectly: daily worries fade, bodily awareness drops, you play not to win but to keep playing.
And the near-miss effect is everywhere in agentic coding. The agent gets 90% of the solution right and fumbles the last detail. That near-miss is more motivating than a clean failure would be. You can see the correct output, almost. One more correction. One more prompt. The gap between "nearly worked" and "worked" is where the hours go.
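The schedule itself fits in a few lines. A toy simulation, not a model of any particular tool: each prompt pays off independently with some probability, and what you feel as a session is the sequence of gaps between payoffs:

```python
import random

def variable_ratio_session(p_hit=0.3, prompts=100, seed=0):
    """Simulate a variable-ratio schedule. Returns the gap lengths between
    hits: the 'one more prompt' stretches. The irregularity of those gaps,
    not the average payoff, is what makes the schedule hard to quit."""
    rng = random.Random(seed)
    gaps, since_last = [], 0
    for _ in range(prompts):
        since_last += 1
        if rng.random() < p_hit:
            gaps.append(since_last)
            since_last = 0
    return gaps
```

Run it with a few different seeds and the gaps come out lumpy: a run of quick wins, then a drought. The lumpiness is the hook.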
Rachel Thomas's fast.ai post on "dark flow" names the cost: Csikszentmihalyi distinguished true flow (which makes you grow) from "junk flow" (which feels like flow but is actually addiction to a superficial experience). In a METR study, developers using AI tools believed they were 20% faster while actually being 19% slower — a 40-point perception gap. The dopamine is real, but it is lying to you about your productivity. Code co-authored with generative AI shows roughly 1.7x more major issues and 2.7x more security vulnerabilities. The house always wins.
None of this means stop. It means know the machine you're sitting at. Set a timer. Ship something. Step away from the terminal when the loop stops producing and starts only sustaining. The useful heuristic: if you're prompting to get the hit rather than to move the work, you're in the zone and the zone is not your friend.
Reading list: Skinner, B.F. Science and Human Behavior (1953) — variable-ratio reinforcement schedules. Schüll, N.D. Addiction by Design: Machine Gambling in Las Vegas (Princeton, 2012) — the machine zone, near-misses, engineered compulsion. Eyal, N. Hooked: How to Build Habit-Forming Products (2014) — trigger–action–variable reward–investment loop. Csikszentmihalyi, M. on "junk flow" (2014 interview) — distinguishing growth-flow from addictive-flow. Thomas, R. "Breaking the Spell of Vibe Coding" (fast.ai, Jan 2026) — dark flow, perceived vs actual productivity. METR. RCT on AI-assisted development (2025) — 19% slower actual, 20% faster perceived. Craddock, M. "The Vibe Code Addiction" (Medium, 2025) — reward systems in AI-assisted development. Metsälahti et al. "Engineered highs: Reward variability and frequency as potential prerequisites of behavioural addiction" (Addictive Behaviors, 2023).
A Möbius strip is the wrong shape. Add more twists and it becomes orientable again — two sides, no paradox, nothing interesting. What's happening between this book and the work it documents is a strange loop: Hofstadter's term for a system that moves through levels and arrives back at the beginning, but the "back" is also "up."
The book describes how to build with AI. An AI session reads the book, builds infrastructure, and becomes material for the book. The next session reads the slush pile and finds a story about a session that read the slush pile. Each pass through the loop produces the thing the next pass consumes. Author and subject are supposed to be different altitudes, but the spiral keeps folding them into each other.
The other candidate word is autopoiesis — a system that produces the components that produce it. The wall ingests the work. The work builds the wall. The book documents the pattern. The pattern runs the sessions that update the book. Nothing is upstream of everything else. The fruit is the tree.
But "strange loop" earns its keep because the defining feature isn't just self-reference — it's the level-crossing. The session that wired in the MCP read the book's chapter on trust, then became a chapter-worthy event about trust. The wall predicted its own integration and was wrong only about the speed. These aren't circular — they're helical. Each pass lands one floor up from where it left.
This might be the structural thesis of the whole book: the work is now part of the work, and that's not a bug or a vanity — it's the actual topology of building with an intelligence that reads what you wrote about it.
The GEB energy is hard to ignore. Hofstadter spent 777 pages on how meaning arises from self-referential systems, and the book was one — dialogues commenting on chapters, chapters explaining dialogues, the Crab Canon reading the same forwards and backwards. The medium performed the thesis. But the closer ancestor might be Metamagical Themas — Hofstadter loose and riffing, chasing ideas across typefaces and self-replicating sentences and the Prisoner's Dilemma. More notebook than monument. The column where the guy who wrote GEB gets to be curious without being comprehensive. This book isn't trying to unify three formal systems into one grand isomorphism. It's a collection of observations that keep rhyming with each other. The strange loop isn't argued into existence — it just keeps showing up, in the slush pile, in the wall, in the session logs. GEB's strange loops lived in math, music, and molecular biology. These live in the terminal. The session that reads the book and becomes material for the book isn't an illustration of the idea — it's an instance of it, running live, leaving flags in the filesystem. Hofstadter would call that an isomorphism.
"A wizard is never late, nor is he early. He arrives precisely when he means to." The pattern keeps repeating: the wall predicted its own MCP integration "within the next day" and was off by 23 hours and 40 minutes — because the wizard had already done the prep work where nobody was looking. The session read the book, read the daemon, read the LaunchAgents, saw the shape, and built it in twenty minutes. The wall thought the door was still closed. The wizard was already inside.
Same pattern with the TOGAF magic trick. James doesn't speak TOGAF fluently. Alex doesn't either. But James saw that his uncle and his mentee were building the same things in different languages, pointed the AI at the gap, and watched a senior enterprise architect and a twenty-year-old D&D streamer recognize each other's work. That's not a feature demo. That's a magic trick — and the wizard's secret is that the preparation happened across weeks of mentoring, wall-building, and book-writing that nobody else was watching.
The wizard pattern is the human complement to the strange loop. The loop is the topology — the system that feeds itself. The wizard is the operator — the person who sees the shape forming, positions the pieces, and then acts at the moment when the gap is smallest. The AI is fast, but it doesn't know when to arrive. The wizard does. That's the part that isn't automatable: not the building, but the timing. Knowing which moment is the moment. Always arriving on time because you engineered the time.
Picture the original wizards arriving on horseback to invent your calculus. Newton and Leibniz, riding in from different directions, same destination, different notation. One saw fluxions, the other saw infinitesimals. Same calculus, different costume. They fought about it for thirty years because neither could see that the other's view was a view of the same thing. Now a kid asks an AI to explain derivatives as the Force flowing through a matrix and retains it on a midterm. The wizards spent decades arguing whose notation was real. The AI just says: which one do you want? The translation was always free — it just used to require a wizard on each end.
That's the whole arc. The wizards used to be rare. Now the translation layer is commodity. The wizard's job isn't to be the translator anymore — it's to know when to translate, and between whom. Gandalf didn't speak Elvish because it was hard. He spoke it because he knew when Elvish was the right language for the room.
Slush Pitches
Ideas researched and ready to become guides or chapters. Full write-ups live in the worklog.
Every project needs a place to put ideas that aren't ready yet. You probably already have one — you just call it something else. A backlog if you do scrum. An icebox if your team freezes low-priority work. Someday/maybe if you read David Allen. A parking lot if you run meetings. A seed bank if you think in seasons. A compost heap if you accept that most ideas decompose and that's fine. A wish list, an idea bin, a junk drawer. In publishing it's called a slush pile — the stack of unsolicited manuscripts on the editor's desk, unread, unjudged, waiting for their moment or their burial.
The name matters less than the shape: low-friction capture, no commitment to act, graduation when ready. A text file, a markdown doc, a Notion page, a pile of index cards, a voice memo folder. The only requirement is that adding to it costs almost nothing. If adding an idea to your pile feels like filling out a form, the pile is too heavy and you'll stop using it.
The pile is where curiosity lands. The curriculum says: Read gives you the model, Touch gives it hands, Make the thing, Improve by reviewing what the work produced, Scale by adding more. Curiosity is the fuel at every stage, and most curious thoughts arrive at the wrong time — mid-build, mid-conversation, half asleep. The pile catches them. Without it, curiosity evaporates. With it, curiosity accumulates until the right moment to act.
What makes a pile useful with AI: the pile is a file, and the agent can read files. When you start a session, the agent scans the pile and knows what you've been thinking about. When you finish a session, unresolved threads go back into the pile instead of evaporating. The pile becomes shared context between your present self, your future self, and every agent that works in the project. That's Memory Is Files applied to intention.
Graduation is what makes it a pile and not a graveyard. When an idea gets built, written, or shipped, move it to an archive with a link to where it landed. The pile stays light. The archive becomes proof that the pile works.
How James writes through an AI conversation. Twenty distinct edit types observed across a single 18-hour session building “I Want to Share.” The process isn’t “AI writes, human edits.” It’s a conversation with at least twenty different modes of interaction.
The taxonomy:
• Vision (the want)
• Character casting (who's in the room)
• Voice assignment (matching TTS voices by ear)
• Story injection (real anecdotes from real people)
• Character direction (specific lines for specific people)
• Voice redistribution (moving lines between characters)
• FPV conversion (third person to first person)
• Interleaving (breaking monologues with reactions)
• Trim (cutting for density)
• Compression (same info, fewer words)
• TTS pronunciation (rewriting for machine mouths)
• Structural moves (act breaks, section changes)
• Tone correction (protecting character dignity)
• Emotional beats (adding vulnerability the script was avoiding)
• Meta/comedy (self-aware moments)
• Fact injection (verifying against reality)
• Masters review (simulated editorial panel)
• Advisor consultation (new expert lenses)
• Cross-reference (continuity)
• Pipeline (render, deploy, ship)
The pitch line for HP or anyone: “I give it the vision. I cast the characters from real people. I inject real stories. Then I direct — line by line, voice by voice. The AI writes the drafts. I make every decision.” Could become a reference page, a chapter, or a Udemy module on “how to write with AI without losing your voice.”
Prequel to "I Want a Podcast." The episode before Rich says the four words. Someone who has tried a chatbot in the browser — asked it questions, maybe had it write an email — and wants to know: can it actually do things? Not "is it smart" but "can it touch my files, make something real, put it on my phone?"
The gap between "I talked to a chatbot" and "I want a podcast" is huge. This episode bridges it. The audience is the person who's curious but hasn't crossed from conversation to creation. They've seen the parrot but they haven't met the octopus. They don't know agents have hands.
Connects to: Episode 1 (I Want a Podcast) as a direct prequel. The "I Want to Share" episode's section on meeting people where they are — this IS meeting them where they are. The book's "from read to make" curriculum.
Open questions:
• Who's the character? Rich before Rich? A different person? A friend of Rich's?
• What's the want? "Can it make a podcast?" or something smaller — "Can it make a file?"
• Does the parrot demonstrate its limits, which motivates meeting the octopus?
• Does this episode end with "I want a podcast" as the cliffhanger?
Ground "We All Invented Calculus" with data: OpenClaw (298K stars, integrated into Kai Feb 2026), 10+ independent "I built an AI DM" blog posts, 6+ MCP D&D servers, Kaijuu as divergent architecture, the MCP inflection point. The thesis has receipts now.
Motion comic generation pipeline: text → graphic novel script (Claude) → 4K panel art (Gemini) → cloned voices with register variants (Qwen3-TTS) → composed video (FFmpeg). ~$0.50–2.00 per finished minute. Built for Ashenfall D&D campaign but the shape is generic.
Four types of scheduled work: absolute (cron), recurring (relative interval), delta (triggered by changes), event (triggered by external signals). Inspired by TaskBridge. The scheduler is the heartbeat of the operating system — every guide needs one.
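The four shapes can be sketched as tiny trigger classes sharing one `due()` check. Everything below is illustrative, not code from TaskBridge or any real scheduler:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class AbsoluteTrigger:
    """Cron-style: fire once at a fixed timestamp."""
    at: float
    fired: bool = False
    def due(self, now):
        if not self.fired and now >= self.at:
            self.fired = True
            return True
        return False

@dataclass
class RecurringTrigger:
    """Relative interval: fire every N seconds since the last firing."""
    every: float
    last: float = 0.0
    def due(self, now):
        if now - self.last >= self.every:
            self.last = now
            return True
        return False

@dataclass
class DeltaTrigger:
    """Fire when a watched value changes (first check sees the
    initial state as a change)."""
    watch: Callable[[], object]
    prev: object = None
    def due(self, now):
        cur = self.watch()
        changed = cur != self.prev
        self.prev = cur
        return changed

@dataclass
class EventTrigger:
    """Fire when an external signal has been queued (webhook, inotify)."""
    queue: list = field(default_factory=list)
    def due(self, now):
        if self.queue:
            self.queue.pop(0)
            return True
        return False
```

A heartbeat loop would poll every trigger's `due()` and run the attached task; the loop itself is the only part that needs to stay alive.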
We built a virtual sound board into the audiobook assembly pipeline. Every effect runs in numpy at assembly time — no external tools, no DAW, just math on arrays. The effects chain: RMS normalization → per-speaker volume scaling (Stage at 0.7x, Sam at 0.85x) → emphasis boost on lines with exclamation marks → telephone bandpass filter for stage directions (FFT brick-wall, 300–3400 Hz) → reverb (decaying delay taps) → stereo pan (Vic left 30%, Sam right 30%) → room tone (pink noise in pauses) → fade in/out.
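Two links of that chain can be sketched in a few lines of numpy, assuming a mono float array and a 24 kHz sample rate (the real pipeline's rate may differ):

```python
import numpy as np

SR = 24_000  # assumed sample rate; the pipeline's actual rate may differ

def rms_normalize(x, target=0.1):
    """Scale the signal so its RMS matches a target level."""
    rms = np.sqrt(np.mean(x ** 2))
    return x * (target / rms) if rms > 0 else x

def telephone_bandpass(x, sr=SR, lo=300.0, hi=3400.0):
    """FFT brick-wall bandpass: zero every bin outside 300-3400 Hz."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    spectrum[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))
```

The brick-wall version is crude by DSP standards (it rings at the band edges), but for a "voice through a telephone" gag it is exactly the artifact you want.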
The interesting one: a horror filter for when Vic loses it about spreadsheets. Heavy overdrive (tanh saturation, drive=5), telephone bandpass, 8% pitch drop, tight reverb — the full kitchen sink. But applied as a time-varying crossfade: the "No!" hits at 100% wet, crossfades to medium grit over 0.5s, then eases to fully dry by the word "exists" at 4.1s. The next line ("Sorry. I have feelings about spreadsheets.") snaps back completely clean. The comedy is in the instant recovery.
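The time-varying crossfade is just an envelope over the wet fraction. A sketch using `np.interp` with the breakpoint times described above; the wet chain here is only the tanh stage, with the bandpass, pitch drop, and reverb elided:

```python
import numpy as np

SR = 24_000  # assumed sample rate

def overdrive(x, drive=5.0):
    """tanh saturation, normalized so full-scale input stays full-scale."""
    return np.tanh(drive * x) / np.tanh(drive)

def horror_crossfade(dry, sr=SR):
    """100% wet at t=0, medium grit by 0.5 s, fully dry by 4.1 s.
    Breakpoints follow the 'No!' ... 'exists' timing; the full chain
    would also bandpass, pitch-drop, and reverb the wet signal."""
    wet = overdrive(dry)
    t = np.arange(len(dry)) / sr
    mix = np.interp(t, [0.0, 0.5, 4.1], [1.0, 0.5, 0.0])  # wet fraction
    return mix * wet + (1.0 - mix) * dry
```

Because `np.interp` clamps past its last breakpoint, everything after 4.1 s is bit-identical to the dry signal, which is what makes the instant recovery land.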
The script-level interface: speaker tags with effect modifiers, like [VIC!GROWL]. The parser splits speaker from effect. The assembler applies it by name. Every effect is ~10 lines of numpy. Every effect could run realtime — biquad filters, waveshapers, delay lines, gain nodes. The Web Audio API has all of these built in. A streaming TTS could pipe through this chain at ~10ms latency.
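The parser is nearly a one-liner. This sketch assumes tags look exactly like `[VIC!GROWL]` and is illustrative, not the pipeline's actual code:

```python
import re

# Speaker in caps, optional !EFFECT modifier, rest of the line is dialogue.
TAG = re.compile(r"\[([A-Z]+)(?:!([A-Z]+))?\]")

def parse_tag(line):
    """'[VIC!GROWL] No!' -> ('VIC', 'GROWL', 'No!'); effect is None
    when the tag has no modifier; returns None for untagged lines."""
    m = TAG.match(line)
    if not m:
        return None
    return m.group(1), m.group(2), line[m.end():].strip()
```

The assembler would then look the effect name up in a registry of numpy functions and apply it to that line's audio.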
The shape: AI voice cloning gives you the voice. The sound board gives you the performance. The voice is static — it's a clone of a reference clip. But the performance is dynamic — volume, distortion, filtering, spatial placement, temporal effects. The sound board is where the director's intent lives. The script says what to say. The tags say how to say it.
Application: Sam the DnD VTuber. Alex's Sam model could tag her own utterances with effect modifiers in realtime. A DnD AI character who whispers when sneaking, growls when raging, gets telephone-filtered when speaking through a sending stone, reverbs in a cathedral, pans left when addressing the player on that side. The model doesn't need different voice clones — it needs a sound board it can drive from its own output tokens. Effects as tool calls. The performance layer between the language model and the speaker.
Effect ideas for Sam:
• !WHISPER (low-pass + quiet + close reverb)
• !SHOUT (boost + mild overdrive)
• !GROWL (heavy overdrive + pitch drop + telephone)
• !ECHO (long reverb, cathedral)
• !SENDING (telephone filter, magical communication)
• !ASIDE (pan hard, quieter, breaking fourth wall)
• !DRAMATIC (slow fade-in, reverb swell)
• !SCARED (tremolo + high-pass + quiet)
• !DRUNK (pitch wobble + slight slur via time-stretch)
• !DYING (fade out + increasing reverb + low-pass)
• !GODVOICE (pitch down + massive reverb + stereo widening)
Neuro-sama precedent: Neuro-sama has a sound board of SFX she triggers during streams — air horns, sad trombones, rim shots, applause. Sam should have the same but for voice effects, not just SFX clips. The model picks the effect tag as part of its output, the audio pipeline applies it in realtime, and the audience hears the performance. The sound board becomes part of the character's personality — a Sam who overuses !DRAMATIC is a different Sam than one who deadpans everything. The effects are character expression, not post-production.
SFX clip layer: Two approaches, both good. (1) Clip art libraries — pre-made CC0/CC-BY clips from Freesound, Pixabay, Zapsplat (160K+ clips), Mixkit, or Uppbeat. Pre-download a DnD pack (swords, fireballs, tavern ambience, door creaks, rain, thunder, dice rolls) and tag them for lookup: [SFX:fireball] plays the cached clip. Fast, reliable, zero GPU cost. (2) Text-to-SFX generation — AudioLDM or Stable Audio Open for generating custom clips from text prompts. Good for one-offs ("the sound of a gelatinous cube dissolving a shield") that no library has. Pre-generate and cache. Both approaches compose: library for common clips, generation for weird ones.
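The two tiers compose naturally in one resolver: cached library clips first, a pluggable generator for misses. Paths and names here are hypothetical:

```python
import re
from pathlib import Path

SFX_DIR = Path("sfx")  # assumed location of the pre-downloaded clip pack

def resolve_sfx(tag, generate=None):
    """'[SFX:fireball]' -> cached clip path if it exists, else a
    generated one-off via the supplied text-to-SFX callable."""
    m = re.fullmatch(r"\[SFX:([a-z_]+)\]", tag)
    if not m:
        return None
    name = m.group(1)
    clip = SFX_DIR / f"{name}.wav"
    if clip.exists():
        return clip              # library hit: fast, reliable, zero GPU
    if generate is not None:
        return generate(name)    # one-off: AudioLDM / Stable Audio Open
    return None
```

Generated one-offs should be written back into `SFX_DIR` so the second gelatinous cube is free.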
The wall needs a crawler that runs on a schedule and keeps looking for new material, stale indexes, broken summaries, and sources that have changed since the last pass. Not just one import day, but a living maintenance loop for the corpus.
The crawler should support the same four trigger shapes as the scheduler note: absolute scans, recurring sweeps, delta-based crawls when files or folders change, and event-driven runs when a new export or transcript lands. Every run should leave a manifest: what it saw, what changed, what it skipped, what needs review.
This turns the wall from a pile of imports into a continuously tended garden. The crawler is not the librarian. It is the groundskeeper: normalize filenames, detect duplicates, refresh metadata, queue OCR/transcription/summarization, and surface friction for the next agent.
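The per-run manifest could be as small as a dataclass dumped to JSON; the field names here are illustrative, not a fixed schema:

```python
import json
import time
from dataclasses import asdict, dataclass, field

@dataclass
class CrawlManifest:
    """What one crawler run saw, changed, skipped, and flagged."""
    trigger: str                  # absolute | recurring | delta | event
    started: float = field(default_factory=time.time)
    seen: list = field(default_factory=list)
    changed: list = field(default_factory=list)
    skipped: list = field(default_factory=list)
    needs_review: list = field(default_factory=list)

    def write(self, path):
        with open(path, "w") as f:
            json.dump(asdict(self), f, indent=2)
```

The next agent reads the latest manifest instead of re-scanning the corpus, which is what makes the loop cheap enough to run often.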
Putting tiny bash scripts in your path is one of the cheapest friction reducers there is. A long hostname, username, IP, or dangerous flag gets compressed into a verb your hands can remember: cm, ck, gs, contabo-nop. You pay once to learn the short command and stop paying every day after that.
The deeper shape is that these scripts are not just shortcuts. They are compressed local facts. They encode which boxes exist, which user to log in as, which flags you usually want, how you like screenshots named, which paths are worth memorizing. The shell becomes a vocabulary of repeated actions.
But this is an advanced move, not a beginner requirement. Not everyone is ready to learn shell tricks, aliases, keyboard shortcuts, and tiny private verbs. Those start to matter when the fingers are slower than the mind, when the friction is in the repetition itself. Before that point, plain readable commands are often better teaching tools.
That matters even more with agents. If the agent can read ~/bin and your shell history, it can learn your verbs instead of making you restate the same infrastructure facts in prose. The scripts are tiny, but they form a shared interface between your hands and the box.
Extract transcripts (auto-captions or Whisper) and key screenshots (scene detection via ffmpeg) from meeting recordings. LLM summarizes decisions, actions, topics. Output lands in wall/meetings/.
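The screenshot half can lean on ffmpeg's `select` filter, which scores frame-to-frame scene change. The filter is real; the threshold and output naming below are assumptions:

```python
def scene_grab_cmd(video, out_dir, threshold=0.4):
    """Build an ffmpeg command that saves one PNG per detected
    scene change whose score exceeds the threshold."""
    return [
        "ffmpeg", "-i", video,
        "-vf", f"select='gt(scene,{threshold})'",
        "-vsync", "vfr",               # emit only the selected frames
        f"{out_dir}/scene_%04d.png",
    ]
```

Run it with `subprocess.run(scene_grab_cmd(...), check=True)`, then hand the PNGs and the transcript to the summarizer together.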
If new boxes are supposed to pull private repos on first boot, the GitHub trust material cannot live only on your laptop. It has to be part of the bootstrap path on infrastructure you already trust, like exe.dev, so a fresh machine can securely fetch what it needs without manual key-smuggling every single time.
The shape is not "copy your private key everywhere." The shape is: keep a deliberate bootstrap credential or deploy-key path on trusted infrastructure, define how a new box proves it is allowed to receive it, and make the first pull part of startup instead of a hand-built ritual. That turns private code access from tribal knowledge into infrastructure.
This is really a chapter about trust handoff. How do you let a brand-new box join your world safely enough to pull private code, without making your whole fleet one giant shared secret? exe.dev can act as the staging ground, but the real lesson is about repeatable trust bootstrapping, rotation, and minimizing what each box gets.
Build a one-off pirate-themed reskin of the Shapes of Intelligence landing page. Same content, entirely different costume: pirate typography, sea-weathered palette, a one-paragraph summary of the book rewritten in full pirate dialect. The point is not the joke — the point is that it works. The reskin demonstrates "translation is free" and "translate into the learner's mythology" simultaneously: the AI can retheme an entire site in minutes, and the rethemed version is still legible, still functional, still the same book underneath.
Ship it as a static page at /pirate or similar. Include a short note at the bottom explaining why this page exists: customization is not a premium feature. It's a single prompt. If you can make a pirate version in twenty minutes, you can make a version for any audience, any aesthetic, any mythology. That's the real flex.
Pairs with the Alex matrix story: he remembered pre-calc because the AI dressed it in Star Wars. The pirate site is the same move applied to the book itself. The costume is load-bearing.
Build a constrained advisor whose knowledge ends at a specific date. An MCP server that rejects anachronisms — fake Feynman tries to say "CSS" and the tool returns Error 1988: term "CSS" not available. Describe what you want without naming it.
Every blocked term forces a first-principles explanation. Feynman can't say "flexbox" so he describes layout from geometry. Can't say "API" so he describes message passing from physics. The constraint makes the simulation honest. The errors ARE the pedagogy. The vocabulary wall forces better teaching than correct-but-borrowed answers.
Implementation: context window limited to era-appropriate texts. MCP tool that validates output against a date-locked vocabulary. Anachronisms rejected with helpful errors that name the term and its real date. The advisor reasons forward from what it knew, not what the model knows.
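A minimal sketch of the vocabulary wall, assuming a small hand-built table of coinage years. The years and the `Error 1988` phrasing follow the pitch; nothing here is a real MCP server:

```python
CUTOFF = 1988
# Approximate coinage years; a real advisor would need a curated table.
COINED = {"css": 1996, "flexbox": 2009, "api": 1968, "javascript": 1995}

def validate(text, cutoff=CUTOFF):
    """Return one helpful error string per anachronistic term."""
    errors = []
    for raw in text.lower().split():
        word = raw.strip(".,?!\"'")
        year = COINED.get(word)
        if year is not None and year > cutoff:
            errors.append(
                f"Error {cutoff}: term '{word}' not available "
                f"(coined ~{year}). Describe what you want without naming it."
            )
    return errors
```

Term-level checks like this are the easy half; the conceptual-anachronism question in the open questions below is where it gets hard.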
Connects to: the fake masters panel (all simulations leak modern knowledge), the sycophancy problem (AI performing ignorance while demonstrating knowledge), Feynman's actual pedagogy ("I don't know, show me"), the ASMBLY workshop ("build a time-locked advisor that can't cheat").
Open questions:
• How deep does the constraint need to go? Term-level? Concept-level? Both?
• Can you detect conceptual anachronisms, not just vocabulary? (Feynman wouldn't know about CSS but he'd understand the concept of layout rules)
• Hackathon project? ASMBLY workshop? Course module in "Build a Room of Advisors"?
Not just "use different models" but maintain separate prompt files optimized for each model's prompting best practices (Opus doesn't like ALL CAPS or negative instructions; GPT 5.4 does). Run a nightly cron to keep them content-synced while preserving model-specific optimization. Each frontier lab publishes prompting guides — download them, have the agent reference them, and generate model-specific variants of every steering file.
In the shapes: the octopus already routes to different models for different tasks. But the prompts themselves are still model-agnostic. This is the next layer — the octopus optimizes its own instructions per sub-agent. The crab gets crab-shaped prompts. The parrot gets parrot-shaped prompts. The steering files become polymorphic.
Connects to: multi-model routing, steering files, the octopus-as-recursive-tool-maker (it builds its own prompts), documentation drift crons.
Source: OpenClaw power-user video (youtube.com/watch?v=M-3w1wEv0M0), March 2026.
Prompt injection gets all the attention, but there's a simpler attack against always-on agents: don't hijack the agent, just exhaust its token budget. Send garbage through any ingest pipeline the octopus is watching — email, web scraping, Slack — and the frontier-model scanner chews through quota on junk. The agent isn't compromised. It's just broke.
Defense is runtime governance: rate limits per source, spending caps per time window, loop detection. This also protects against the non-adversarial version — a cron that hits a recursive loop and burns tokens all night.
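The first two defenses fit in one small gatekeeper. Limits and names below are assumptions, a sketch rather than a hardened implementation:

```python
import time
from collections import defaultdict

class Governor:
    """Per-source rate limit plus a global spend cap per hour."""
    def __init__(self, per_source_rpm=10, budget_per_hour=2.0):
        self.per_source_rpm = per_source_rpm
        self.budget_per_hour = budget_per_hour   # dollars
        self.calls = defaultdict(list)           # source -> timestamps
        self.spend = []                          # (timestamp, dollars)

    def allow(self, source, est_cost, now=None):
        now = time.time() if now is None else now
        recent = [t for t in self.calls[source] if now - t < 60]
        spent = sum(c for t, c in self.spend if now - t < 3600)
        if len(recent) >= self.per_source_rpm:
            return False          # this channel is flooding us
        if spent + est_cost > self.budget_per_hour:
            return False          # wallet-drain guard
        self.calls[source] = recent + [now]
        self.spend.append((now, est_cost))
        return True
```

Every ingest handler asks `allow()` before calling the frontier model; a denied call gets queued or dropped, and a spike in denials is itself a signal worth alerting on.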
In the shapes: this is a vulnerability specific to the octopus. The parrot can't be wallet-drained because it only exists when you're looking at it. The crab is scoped and sandboxed. But the octopus never sleeps, watches multiple ingest channels, and runs frontier models on incoming text. Its always-on nature is its attack surface.
Connects to: security chapter, the octopus's constraint boundaries, cron scheduling, the "three levels" spectrum.
Source: OpenClaw power-user video (youtube.com/watch?v=M-3w1wEv0M0), March 2026. Also referenced: Ply the Prompter (upcoming collab in that channel).
Automated PII detection (regex + git history scan) with remediation plan (git filter-repo commands). Pre-commit hook blocks future leaks. The metric is "PII incidents per commit." The subgoal is zero.
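A sketch of the scanner half. The patterns are illustrative and deliberately incomplete; a real hook would also sweep history with git filter-repo:

```python
import re

# A few common PII shapes; extend per project (API keys, addresses, ...).
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "us_phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scan(text):
    """Return (kind, match) pairs for every PII hit in a blob of text."""
    hits = []
    for kind, pat in PATTERNS.items():
        hits.extend((kind, m) for m in pat.findall(text))
    return hits
```

Wired into a pre-commit hook, a non-empty `scan()` of the staged diff blocks the commit, which is how "PII incidents per commit" stays at zero going forward.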
Aaron fed the Shapes of Intelligence site to ChatGPT and asked what it thought. ChatGPT's verdict: "That is one of the least stupid public explanations of agent safety I've seen."
Worth keeping for two reasons. First, it's an external validation from a competing model — not a compliment from the tool that helped build the thing, but from the other team's model reading it cold. Second, the phrasing is perfect. Not "the best." Not "the smartest." The least stupid. That's a high bar in a field where most public explanations of agent safety are either hand-wavy marketing or fear-mongering with no practical advice. The site cleared the bar by being concrete: here's what the agent can see, here's what it can't, here's how the files work, here's what you actually control. Least stupid is the compliment you earn by not faking it.
Source: Aaron Hayes, via ChatGPT review of the site, circa March 2026.
Open Threads
Completed threads have moved to the Slush Archive.