Author’s note: this post was drafted by Claude (Anthropic) from my project notes and source code, then reviewed and edited by me before publishing. The voice and judgments are mine; the typing isn’t.

I’ve been trying to turn a knowledge base into a video game. Not an educational game in the quiz-with-graphics sense — an actual game, where understanding something real is the thing that lets you progress. Over one intense stretch I built five prototypes across three projects, and the most valuable output wasn’t any of them. It was a classification that now sorts every idea in this space in about five seconds. This is the writeup of the prototypes and that finding.

The vision

The dream, from the very first conversation: a game where you “go to university with real knowledge.” You don’t get quizzed or lectured. You explore a world, and the knowledge graph itself generates the play space — knowing a real relationship between two ideas lets you do something you otherwise couldn’t.

The genre this points at has a name now: the metroidbrainia. In a metroidvania, progression is gated by items; in a metroidbrainia, progression lives in the player’s head, not a save file. Outer Wilds, Return of the Obra Dinn, The Case of the Golden Idol, The Witness. You can replay any of them, but you can’t un-know them. The win condition is that you understood something — which is exactly the win condition of learning, so the fit is natural. The hard part is everything else.

Consilience: the confirmation board

The first prototype, playable at /labs/consilience/, is a Flask game in the Golden Idol mould. You wander a small university, study fact-nodes — each one a real, cited fact — and assemble what you’ve learned on a confirmation board to name a connection the departments never tell you about each other. The test revelation is a real one, the kind of thing two different lecture halls each know half of.

It works. People get the “oh!” moment. But mechanically, filling slots on a board from a set of collected tokens is recognition — and recognition, however nicely dressed, is a quiz with good taste. That nagging feeling is what eventually became the three-doors finding below.

One engineering rule from Consilience carried forward into everything since: reachability is an invariant. Any required fact, puzzle, or goal must be provably reachable from the start, and that is enforced by a check at build time rather than by hope. A treasure no one can reach is a bug, even if every individual room is fine.
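A minimal sketch of what such a build-time check could look like, assuming the world is stored as a plain adjacency map. The room names, `WORLD`, `REQUIRED`, and both function names are hypothetical and not taken from the Consilience source; the real check may well be shaped differently.

```python
from collections import deque

def reachable_from(start, edges):
    """Return every node reachable from `start` by following `edges`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def check_reachability(start, edges, required):
    """Fail the build if any required node cannot be reached from `start`."""
    missing = sorted(set(required) - reachable_from(start, edges))
    if missing:
        raise ValueError(f"unreachable from {start!r}: {missing}")

# Hypothetical campus: the observatory has an exit but no way in.
WORLD = {
    "gate": ["library", "quad"],
    "quad": ["physics_hall"],
    "library": ["archive"],
    "physics_hall": [],
    "archive": [],
    "observatory": ["quad"],
}
REQUIRED = ["archive", "physics_hall"]

# Passes silently; adding "observatory" to REQUIRED would raise.
check_reachability("gate", WORLD, REQUIRED)
```

Run as part of the build, a check like this turns "a treasure no one can reach" into a failing step instead of a bug report. Every room in `WORLD` is individually valid, yet the observatory would still fail the check.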

Tickscape: the OSRS-shaped delivery vehicle

The second prototype answers a different question: what should moving through the world feel like? I’ve spent a lot of hours in Old School RuneScape (OSRS), so Tickscape is a Rust prototype (macroquad) of the OSRS engine essentials (the 600 ms game tick, tile-based pathfinding, a skilling loop), delivering the Consilience content: walk a campus, study at fact-nodes, solve a revelation.
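The tick idea is easy to sketch outside the engine. The following is an illustration in Python, not the Tickscape Rust code: `TickClock` and `walk` are hypothetical names, and the one-tile-per-tick walking rule is an assumption borrowed from how OSRS walking behaves.

```python
TICK_MS = 600  # length of one game tick, as in OSRS

class TickClock:
    """Turns variable frame times into whole 600 ms game ticks."""

    def __init__(self, tick_ms=TICK_MS):
        self.tick_ms = tick_ms
        self.accumulated_ms = 0

    def advance(self, frame_ms):
        """Add one frame's elapsed time; return how many ticks are now due."""
        self.accumulated_ms += frame_ms
        ticks, self.accumulated_ms = divmod(self.accumulated_ms, self.tick_ms)
        return ticks

def walk(path, position, ticks):
    """Move along a precomputed tile path, one tile per tick (assumed rule)."""
    for _ in range(ticks):
        if not path:
            break
        position = path.pop(0)
    return position

clock = TickClock()
path = [(1, 0), (2, 0), (3, 0)]
position = (0, 0)
for frame_ms in (250, 250, 250, 1300):   # uneven frame times
    position = walk(path, position, clock.advance(frame_ms))
```

The design point is that rendering can run at any frame rate while the world only changes on tick boundaries, which is what gives movement its stepped, deliberate feel.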

It proved the thing it was built to prove, with a sharp caveat: OSRS is the shell, not the game. A walkable world where knowledge has a place is genuinely valuable — place-memory is real; you remember where you learned something. But OSRS’s core verb is grinding, and grinding is exactly the wrong learning mechanic. So the engine survives as a wrapper for whatever the real mechanic turns out to be, and each “node” in that world should be a self-contained concept-toy.

The three doors

Here’s the finding that sorts everything. There are three ways to make a concept matter in a game:

  1. Recognise it — pick the right answer from collected tokens. That’s a quiz with good taste. (Consilience’s board.)
  2. Implement it — write the code that proves you get it. That’s a lecture, or a lab exercise. (An early Foundry node did this: write memoisation or the runtime’s time gate kills your exponential fib(45). Satisfying — and unmistakably coursework.)
  3. Inhabit it — the concept is the physics of the world. You play, and understanding is the residue.

Only the third one is a game. The load-bearing sentence in my design notes is: the concept should be the toy, not the test. And its corollary: name the concept after the player wins, as a reward — never as a briefing. The games that already do this properly are the reference set: Patrick’s Parabox, Baba Is You, Turing Complete, Recursed.
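For concreteness, the door-two exercise mentioned above comes down to something like the following. This is a generic sketch, not the Foundry node itself: the runtime's time gate is not shown here, and the call counter is a stand-in added purely to make the cost visible.

```python
from functools import lru_cache

calls = 0  # illustrative stand-in for the time gate: counts naive calls

def fib_naive(n):
    """Exponential-time Fibonacci: recomputes the same subproblems."""
    global calls
    calls += 1
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    """Memoised Fibonacci: each value of n is computed once."""
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)

small = fib_naive(20)   # already takes 21,891 calls
big = fib_memo(45)      # 1134903170, with only 46 distinct computations
```

Writing the second function is satisfying, and it is also exactly what a lab sheet would ask for, which is the point of the verdict above.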

Computer science turns out to be arguably the best-fit domain for door three, because the computer can run your understanding — something history can’t do. The trap is that “the computer runs it” drifts naturally toward door two and homework. The answer is to use executability to verify play, not to assign exercises.

The concept toys

To test door three directly I built three single-file web toys, each one CS concept made playable, each following the name-it-after-you-win rule:

  • “You Are the Search” — graph search. Your only two moves are Flood (expand the oldest discovered node) or Dive (expand the newest). That one choice is the difference between breadth-first and depth-first, and different maps reward different choices. Mechanically the cleanest of the three — and rejected, because a node-and-edge graph on screen looks like a textbook figure. Ironically lecture-y.
  • “Inside” — recursion. A room is split by a wall with no door. One box on your side is empty; the other contains the room it sits in, drawn recursively — you can see the room nested inside it, and the room inside that. Stepping into the self-containing box folds you across the wall. The draw function is itself recursive, to render recursion. This is the front-runner, and the reason is instructive: it’s the only prototype that’s a world with a character in it rather than an interactive diagram.
  • “The Mechanism” — logic gates. Wire a lock from AND/OR/NOT/NAND against a live truth table, building up to XOR from three gates — the whole “a computer is just gates stacked” lesson in one move. The deepest payload of the three, but also the most diagram-like, which is the same quality that sank the search toy.
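Two of these toys have cores small enough to sketch in a few lines. The code below is illustrative Python rather than the toys' actual web source; the function names and the sample map are hypothetical, and a fixed policy stands in for the player's choice at each step.

```python
from collections import deque

def explore(graph, start, goal, move):
    """Expand nodes until `goal`; 'flood' takes the oldest discovered node,
    'dive' takes the newest. Returns the order in which nodes were expanded."""
    frontier = deque([start])
    discovered = {start}
    order = []
    while frontier:
        node = frontier.popleft() if move == "flood" else frontier.pop()
        order.append(node)
        if node == goal:
            break
        for nxt in graph.get(node, ()):
            if nxt not in discovered:
                discovered.add(nxt)
                frontier.append(nxt)
    return order

# Hypothetical map: two corridors leading away from A.
MAP = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": [], "E": []}
flood_order = explore(MAP, "A", "E", "flood")  # ['A', 'B', 'C', 'D', 'E']
dive_order = explore(MAP, "A", "E", "dive")    # ['A', 'C', 'E']

# "The Mechanism": XOR wired from three gates (OR, NAND, AND).
def gate_and(a, b): return a and b
def gate_or(a, b): return a or b
def gate_nand(a, b): return not (a and b)

def xor_lock(a, b):
    """True when exactly one input is on."""
    return gate_and(gate_or(a, b), gate_nand(a, b))

truth_table = [(a, b, xor_lock(a, b)) for a in (False, True) for b in (False, True)]
```

On this map Dive reaches the goal in three expansions where Flood needs five; a map with the goal one step from the start down the other corridor would reverse the result, which is the "different maps reward different choices" property. The only difference between the two moves is which end of the frontier gets popped.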

The pattern across the verdicts is consistent: the more a prototype looks like the way the concept is taught, the worse it plays. The more it looks like a place you’re standing in, the better.

The honest open problems

Two risks have survived every prototype unsolved, and I’d rather state them than pretend the design is further along than it is.

Authoring is the whole ballgame. Curating which connections and puzzles make a learner actually gasp is taste-bound work, and it doesn’t scale with compute. The knowledge pipeline can generate content; it can’t generate revelations.

Discovery fires once. Each “oh!” is single-use per player, so content burns fast. The mitigation is framing: a finite, curated experience — a “playable course” of a few hours, like Her Story — rather than a thousand-hour live game. Which of those two products this is changes every other decision, and I haven’t decided. CS softens the problem a little, because doing (executable puzzles) is renewable practice layered on top of single-use revelations.

The project is paused deliberately, with a pick-up-here document whose most important section is a list of questions — what’s actually on the screen, what you’re doing moment to moment, where the knowledge lives — because the prototypes kept being good guesses at a picture in my head that I haven’t fully articulated yet. Five builds taught me the shape of the wrong answers. That’s worth more than it sounds.

— Luke Simmons, Auckland
