27 May 2026 — Harry Mitchell

Rebuilding Potato Quest with AI — what changed between 2018 and 2026

Tags: potato-quest, devlog, ai, flutter, flame

Potato Quest started back in 2017 as a side project in Java + LibGDX, and it actually shipped — it went live on both the App Store and Google Play in 2018, picked up a small but genuinely lovely following (4.3 stars on iOS, 4.5 on Google Play), and then I shelved it. In 2026 I picked it up again — but instead of porting the old code I started from a blank Flutter project and rebuilt the entire thing using AI coding agents. Every line of Dart in the current build was written by an AI. Every accessory, every potato skin (except the default body), every background was generated by an AI image model and then cleaned up by hand.

This post is the side-by-side: what the game looked like in 2018, what it looks like now, and the workflow that closed the gap.

The original actually shipped — and people loved it

Before the rebuild story, credit where it's due: the hand-built 2018 game found its players. It was single-player only, rough around the edges, and made on a student budget — but the reviews it pulled in are still my favourite thing about the whole project. A sample of what landed on the stores:

“At first I felt like a bit of a spud, and after initially being a bit burned; within minutes I was crisping away. Will be sure to show it to all my Irish friends and look forward to the next recipe dished up from these wild hoons. 5 stars normally for this, but definitely three Michelin.”

Google Play user, Google Play

“Kame here bekause politburo has taken potato from me and my family, along with kapatalist letter 'see'. Was funs but am no Gulag. At least I find potato in Gulag. Trik, was not potat was rok, sush is lif in glorious komunist latvia.”

Google Play user, Google Play

“Discovered this gem during my honeymoon and had the best 2 weeks of my life.”

Google Play user, Google Play

“The game is meh. But the voice acting is amazing.”

Google Play user, Google Play

“This game is fantastic. I went into this game with an open mind and was pleasantly surprised. However, the game is lacking in story. I ask you, what is Spud's quest?”

Google Play user, Google Play

“Better than lego.”

Google Play user, Google Play

“Playing this game, my life is complete!”

Reviewer19, App Store

“In life, often it is the simple things that are best and this is no exception. Its just what I have been looking for! Its easy to pick, yet hard to master. Fair to say this is going to be the new flappy birds!”

Batmitzvah Man, App Store

There was even one "worst game ever" — to which 2018-me replied, "What about the game makes it the worst ever? I would love to make you pro-Tato Quest!" The follow-up never came. Some quests stay unfinished.

The backgrounds

Drag the slider. Left is the 2018 build, hand-drawn in a single afternoon; right is the 2026 version — same composition, AI-refined. Both sides are live: these aren't flat screenshots but the actual game layers, scrolling at the real in-game speeds. Each scene is three planes (back / middle / front) drifting at different rates — the back creeping, the foreground racing — exactly the velocities the engine uses (the back plane moves at a third the speed of the front).

The depth itself isn't new. The 2018 build already had the same three-plane parallax, scrolling at the same relative speeds; that maths carried straight over. What changed is the art riding on those planes.
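The relative-speed rule is small enough to sketch. The following is an illustration only, written in Java to match the 2018 excerpts later in this post; it is not the game's code. Only the one-third ratio between the back and front planes comes from the description above. The middle factor, the class and method names, and the wrap-around behaviour are assumptions.

```java
/** Hypothetical three-plane parallax: each plane scrolls at a fraction of the front speed. */
final class ParallaxPlanes {
    // Only the back:front ratio (one third) is stated in the post.
    static final double BACK_FACTOR = 1.0 / 3.0;
    static final double MIDDLE_FACTOR = 2.0 / 3.0; // assumption: the post gives no middle value
    static final double FRONT_FACTOR = 1.0;

    /**
     * Horizontal scroll offset of one plane for a given camera position,
     * wrapped into [0, tileWidth) so a tiled image can repeat seamlessly.
     */
    static double offset(double cameraX, double factor, double tileWidth) {
        double raw = (cameraX * factor) % tileWidth;
        return raw < 0 ? raw + tileWidth : raw; // Java's % keeps the sign of the dividend
    }
}
```

With a camera 300 pixels into the level, the front plane has moved 300 pixels and the back plane about 100, which is the "creeping versus racing" effect described above.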

The starry-night mountains default. Same horizon, same layering — distant peaks creeping on the back plane, a mid-range ridge, the nearest crags racing past in front — but the 2026 art gives each plane texture and light the originals only gestured at.

City scene. The 2018 version was honest about its budget — flat blocks, a couple of colours per plane. The 2026 one was generated from a short prompt describing "the original PotatoQuest city skyline, refined, neon dusk" and then nudged through three rounds of img2img until it sat next to the originals without feeling out of place.

Desert. The dune silhouettes survived; everything else got rewritten in light.

All of the code was written by AI

There is no hand-written Dart in this repo. Every screen, every Flame component, every Supabase service was produced by an AI coding agent — mostly Claude Code, with bursts of GPT for one-off scripts.

What that actually looked like day to day:

Specs first, code second. I'd write a tight description of the next feature ("Bracket Battle payout: coins equal to tier number, 1 at Spud Sprout through 7 at Spud Royale; sample the opponent ghost from recent runs in the same tier") and let the agent draft the change. Reviewing diffs took longer than describing them.
Tests as guardrails, not theatre. The agent was much better at writing the code than at writing meaningful tests. I leaned on hot-reload + manual play sessions to catch regressions, and reserved test cases for the genuinely tricky maths (replay determinism, rig coordinates, score-tier boundaries).
The agent owned the boring stuff. Refactors, file moves, renaming score_bracket → scoreBracket across 40 files, regenerating Supabase types — all of that was one-shot.
I owned the calls that matter. Architecture, data model, what the game should feel like — those stayed mine. The agent is a fast typist with good instincts, not a designer.

The total time from "blank flutter create" to "feature-complete versus modes" was about six weeks of nights and weekends. The 2018 build, written by hand, took roughly the same wall-clock effort just to reach the single-player game that shipped — no versus modes, no rig, a fraction of the content.

Accessories: generated, then hand-cleaned

There are about 35 accessories in the current build — hats, glasses, helmets, crowns. None of them are hand-drawn. The pipeline was:

Prompt against the body. I'd feed Gemini and ChatGPT the existing Body_Regular.png and ask for the exact same potato, in the exact same pose, but wearing a specific accessory. The instruction was always "match the reference body precisely — only add the new accessory."
Pick the best of N. Five or six generations per accessory, picking the one whose proportions most closely matched the rig.
Procreate cleanup. This was the slowest step. On the iPad I'd erase the potato, erase the background, and clean the alpha edge of the accessory until it was a clean transparent PNG. Roughly 10–20 minutes per accessory.
Drop into assets/images/Acc_*.png and let the rig handle alignment.

The reason for matching the body first was practical: by generating the accessory worn on the canonical body, the perspective and scale came out right. Generating an accessory in isolation gave wonky proportions every time.

The potato skins

Same pipeline as accessories, but the goal was to replace the whole potato silhouette while keeping the rig anchors identical. About a dozen skins in the build — Disco, Zombie, Robo, Mash, Rotten, Gold, Cactus, and more — each generated as a full-body sprite matching the reference, then masked.

The default Body_Regular.png is the only sprite I drew myself. Everything else is downstream of it.

The wardrobe, then and now

The clearest way to feel the difference is to look at the spuds themselves. A few of the looks survived the eight-year gap — here's the same idea, hand-drawn then versus AI-generated now:

And here's the real story. This was the entire 2018 wardrobe — every cosmetic the original shipped with, each one a full hand-painted sprite sheet:

And here's a sample of the 2026 wardrobe — skins and accessories, all riding one rig, drawn from a pool of roughly a dozen skins and thirty-five accessories:

The 2018 grid is finite because every one of those potatoes had to be drawn, frame by frame. The 2026 grid isn't a roster so much as a combinatorial space: any skin, any accessory, composited at runtime. That's the whole point of the rig, which is what the next section is about.

A new animation system that doesn't need re-animation

The 2018 version used pre-baked sprite sheets per skin: every animation frame for every variant, all painted by hand. That doesn't scale — adding a single new accessory meant redrawing every animation it would appear in.

The 2026 version does the opposite. There is exactly one animated reference: Body_Regular.png, rigged and animated in code. Every skin and every accessory rides on that rig at runtime. The body bends, the hat follows. The animation system never knows what's on top of it.

To make this work I needed a way to align each accessory to the body precisely — frame by frame — without doing it by eye in a paint program. That became the calibration tool.

It's a debug-only overlay shipped in the same binary. You launch the game with a flag, the calibration screen replaces the menu, and you can:

step through every animation frame of the reference body,
nudge each accessory slot's offset and rotation per frame with arrow keys,
preview the result composited live over the original sprite sheet,
export the resulting rig as JSON to the clipboard.

Paste it into assets/data/potato_rig.json, ship it, done. The whole animation pipeline for a new accessory is now: generate image → clean alpha → run calibration tool for a few minutes → commit.
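To make the "one body, many accessories" idea concrete, here is a rough sketch of what per-frame rig data could look like once loaded. The actual layout of potato_rig.json isn't shown in this post, so every name here (SlotFrame, AccessoryRig, the field names) is hypothetical, as is the assumption that animations loop. It is in Java to match the 2018 excerpts, not the Dart the game is built in.

```java
import java.util.List;

/** Hypothetical shape of one calibrated frame for one accessory slot. */
final class SlotFrame {
    final double offsetX;     // pixels, relative to the body sprite's origin
    final double offsetY;
    final double rotationDeg; // rotation of the accessory on this frame

    SlotFrame(double offsetX, double offsetY, double rotationDeg) {
        this.offsetX = offsetX;
        this.offsetY = offsetY;
        this.rotationDeg = rotationDeg;
    }
}

/** Looks up where an accessory should be drawn for a given animation frame. */
final class AccessoryRig {
    private final List<SlotFrame> frames; // one entry per frame of the reference body

    AccessoryRig(List<SlotFrame> frames) {
        this.frames = List.copyOf(frames);
    }

    /** Assumes the animation loops, so any frame index wraps into the calibrated range. */
    SlotFrame frameAt(int frameIndex) {
        return frames.get(Math.floorMod(frameIndex, frames.size()));
    }

    /** Draw position of the accessory: body position plus this frame's calibrated offset. */
    double[] worldPosition(double bodyX, double bodyY, int frameIndex) {
        SlotFrame f = frameAt(frameIndex);
        return new double[] { bodyX + f.offsetX, bodyY + f.offsetY };
    }
}
```

The point of the design is that nothing in this lookup depends on which image is drawn: a fez and a crown in the same slot would share the same calibrated frames.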

That single change (one body, programmatic accessories, calibration as a tool) is why the 2026 build can offer roughly 35 accessories × 12 skins, around 420 skin-and-accessory pairings, instead of the original's handful of hand-drawn potato variants.

Multiplayer without live multiplayer

The 2018 game was single-player only. The core of the 2026 build is the same endless platformer — drag to aim, release to launch, don't fall off the back of the auto-scrolling screen — but it now has two head-to-head modes layered on top: Versus and Daily Challenge. Neither of them needs real-time networking. The trick is asynchronous ghosts.

Every match played in Potato Quest is captured as a deterministic replay: a compact event log keyed to game ticks (jumps, swings, deaths, the lot). The replay isn't a video; it's a script the game can re-run frame-perfect, dropping a "ghost" version of the original player into a fresh match.
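A toy version of the idea, assuming a fixed-step simulation and integer-only state. This is a sketch of the general technique, not the game's replay format: ReplayEvent, GhostSim, the launchPower payload and the physics are all invented for illustration, and it is written in Java to match the 2018 excerpts.

```java
import java.util.List;

/** One recorded input: what happened, and on which simulation tick. */
final class ReplayEvent {
    final int tick;
    final int launchPower; // hypothetical payload: strength of a launch

    ReplayEvent(int tick, int launchPower) {
        this.tick = tick;
        this.launchPower = launchPower;
    }
}

/** Toy fixed-step simulation: the same events in always give the same score out. */
final class GhostSim {
    /** Re-runs a recording for totalTicks ticks and returns the final score. */
    static long replay(List<ReplayEvent> events, int totalTicks) {
        long height = 0;
        long velocity = 0;
        long score = 0;
        int next = 0; // index of the next event to apply; events must be sorted by tick
        for (int tick = 0; tick < totalTicks; tick++) {
            while (next < events.size() && events.get(next).tick == tick) {
                velocity = events.get(next).launchPower; // apply the input on its exact tick
                next++;
            }
            height = Math.max(0, height + velocity); // integer maths only, so no float drift
            velocity -= 1;                           // constant "gravity" per tick
            score += height;
        }
        return score;
    }
}
```

Because the loop depends only on the event list and the tick count, never on wall-clock time or frame rate, replaying the same log on another device reproduces the same run.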

That single primitive — record once, replay anywhere — gives the game two modes for free:

Daily Challenge. One course, the whole world, 24 hours. Everyone races the same level against a ghost of the current day's leader. You either out-score them and take the crown, or you don't — and the crown changes hands the moment someone posts a higher score. It runs on its own ladder and doesn't touch your all-time high score.
Versus. You're slotted into one of seven tiers by your high score — Spud Sprout at the bottom, Spud Royale at the top. Each match drops you against a single ghost sampled from recent runs in your tier. Beat their score and you bank coins equal to your tier number (1 at Spud Sprout, 7 at Spud Royale) to spend on accessories. These runs still count toward your high score, so winning can bump you up a tier.
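The Versus rules above reduce to two small functions. In this sketch the score thresholds are placeholders (the post doesn't give the real ones), the names are hypothetical, and a tied score is assumed to pay nothing. It is in Java to match the 2018 excerpts rather than the game's Dart.

```java
/** Hypothetical tier lookup and payout for Versus. */
final class VersusTiers {
    // Minimum high score for tiers 1..7. Placeholder numbers, not the game's values.
    private static final int[] MIN_SCORE = { 0, 100, 250, 500, 1000, 2000, 4000 };

    /** Returns the tier number, 1 (Spud Sprout) to 7 (Spud Royale), for a high score. */
    static int tierFor(int highScore) {
        int tier = 1;
        for (int i = 0; i < MIN_SCORE.length; i++) {
            if (highScore >= MIN_SCORE[i]) {
                tier = i + 1; // keep the highest tier whose threshold is met
            }
        }
        return tier;
    }

    /** Coins for one match: the tier number on a win, nothing otherwise (ties assumed to lose). */
    static int payout(int tier, int playerScore, int ghostScore) {
        return playerScore > ghostScore ? tier : 0;
    }
}
```

Boundary values such as 99 versus 100 are exactly the kind of case worth a unit test, since an off-by-one there puts a player in the wrong ghost pool.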

The advantages over real-time multiplayer at this scale:

Latency is zero. The ghost is local data; the only network call is fetching it before the match.
The match always finishes. Nobody rage-quits a ghost. Nobody no-shows.
Scheduling solves itself. Asia plays Europe's ghosts. The "lobby" is the replay pool.
It's cheaper. A handful of Supabase rows per match versus a websocket backend that would dominate the budget for an indie game.

The tradeoff: ghosts can't react to you. They're playing their own match, not yours. For a versus game with no direct interaction (you're both attacking the same level, highest score wins), that's a feature, not a bug, but I wouldn't try this in a fighting game.

A loving postmortem of the 2018 code

Every line of the 2026 build was written by an AI. Every line of the 2018 build was written by me — in my mid-twenties, with almost no coding experience, learning Java by typing until the potato moved. I still have the source, and reading it back is humbling. In the spirit of the honest postmortem, here are my favourite crimes.

It was one file. All of the gameplay lived in a single class, PlayGameState.java: 4,070 lines and 211 instance fields. No components, no systems, no separation — just one enormous brain holding every variable the game could ever need, from ballPosX to isRightBackgroundArrowClicked.

I didn't know lerp existed, so I wrote a gradient by hand. The aiming dots fade green→yellow→red. The grown-up way is one line of interpolation. The 2018 way was to store the colour index in a float, then compare that float against the integers one through fourteen:

public void setCircleColour() {
    if (this.setColourNumber == 1.0F) {
        this.shapeRenderer.setColor(0.0F, 1.0F, 0.0F, 1.0F);
    }
    if (this.setColourNumber == 2.0F) {
        this.shapeRenderer.setColor(0.16470589F, 1.0F, 0.0F, 1.0F);
    }
    // … fourteen of these, hand-mixed, comparing floats with ==
}

Comparing floats with == is the kind of thing that works fine until one day it silently doesn't. It never bit me here only because I was always assigning whole numbers back into the float, and small whole numbers are exactly representable. Pure luck.
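For comparison, a minimal interpolated version might look like the following. It is a sketch, not a drop-in replacement: the class and method names are made up, and it will not reproduce the fourteen hand-mixed values exactly, only the same green to yellow to red sweep.

```java
/** Green -> yellow -> red gradient by linear interpolation. */
final class AimGradient {
    static float lerp(float a, float b, float t) {
        return a + (b - a) * t;
    }

    /**
     * t in [0, 1]: 0 is green, 0.5 is yellow, 1 is red. Values outside are clamped.
     * Returns {r, g, b}; alpha is left to the caller.
     */
    static float[] colourAt(float t) {
        float clamped = Math.max(0f, Math.min(1f, t));
        float r = lerp(0f, 1f, Math.min(1f, clamped * 2f));      // red ramps up over the first half
        float g = lerp(1f, 0f, Math.max(0f, clamped * 2f - 1f)); // green ramps down over the second half
        return new float[] { r, g, 0f };
    }
}
```

A caller would map dot index i of n to t = i / (float) (n - 1) and pass the result to the same four-float setColor call the original used, with no float equality checks anywhere.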

There's a collision check that checks nothing. This one genuinely shipped. It's meant to test the ball against a point, but look at both sides of the first comparison:

if (ballPosX[0] <= ballPosX[0] + pointerPositionX - newPointerCoordinateX - cameraLeftPosition
        && ballPosY[0] >= ballPosY[0] + (pointerPositionY - ...)) {

ballPosX[0] appears on both sides of the <=, so it cancels clean out — the whole thing quietly reduces to 0 <= pointerPositionX - newPointerCoordinateX - cameraLeftPosition. The ball's position, the thing the check is named for, has no effect on the result. It ran on every frame for years and nobody noticed, including me.

I recomputed the same square root twenty-six times. getNumberOfCircles() decides how many trajectory dots to draw based on drag distance. getDistance() does a Math.sqrt. Instead of calling it once and saving the answer, I called it twice in every branch of a long if-ladder — twenty-six identical calls in one 94-line method, all computing the exact same number:

if (getDistance(a, b, c, d) >= 0.0D
        && getDistance(a, b, c, d) <= Math.round(scaleFactor * 10.0D)) { ... }
if (getDistance(a, b, c, d) > Math.round(scaleFactor * 10.0D)
        && getDistance(a, b, c, d) <= Math.round(scaleFactor * 50.0D)) { ... }
// …and so on
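The usual fix is to compute the distance once and walk a table of upper bounds. In this sketch only the 10 and 50 multipliers come from the excerpt above; the other bounds are placeholders because the rest of the ladder isn't shown, and the argument order of the distance helper is an assumption. What each branch actually did is elided in the original, so the sketch returns a band index rather than guessing.

```java
/** Computes the drag distance once, then walks a threshold table. */
final class TrajectoryDots {
    // Upper bounds in units of scaleFactor. 10 and 50 appear in the original excerpt;
    // 100 and 150 are placeholders, since the rest of the ladder is not shown.
    private static final double[] UPPER_BOUNDS = { 10.0, 50.0, 100.0, 150.0 };

    // Argument order (x1, y1, x2, y2) is assumed, not taken from the 2018 source.
    static double distance(double x1, double y1, double x2, double y2) {
        return Math.hypot(x2 - x1, y2 - y1);
    }

    /** Returns the 0-based index of the first band the drag distance falls into. */
    static int bandFor(double x1, double y1, double x2, double y2, double scaleFactor) {
        double d = distance(x1, y1, x2, y2); // one square root, reused by every comparison
        for (int i = 0; i < UPPER_BOUNDS.length; i++) {
            if (d <= Math.round(scaleFactor * UPPER_BOUNDS[i])) {
                return i;
            }
        }
        return UPPER_BOUNDS.length; // beyond the last bound
    }
}
```

Because the bounds are checked in ascending order, each band only needs its upper limit; the lower limit is implied by the previous check failing.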

The typos are load-bearing. launchScroleSpeed. lineaInterpolation. backroundChoice — that last one is the save key for the player's chosen background, so the misspelling is now permanent: fix it and everyone loses their setting. My favourite is in the art folder. There's a desert layer called DesertFrontHills.png, and right next to it, shipped in every single download, a file called DesertFroundHills.png — a typo'd orphan that no code ever references. Players downloaded it for years for no reason.

Speaking of downloads: one hill was 22 MB. That same DesertFrontHills.png weighs 22 MB, roughly two-thirds of the game's entire 34 MB of art, for a single un-optimised PNG of some sand dunes. The 2026 version of that exact layer, as WebP, is about 50 KB. Nobody told 2018-me that PNGs have settings.

None of this stopped the game from shipping or from being fun — which is maybe the real lesson. But it's a good measuring stick for the rebuild: the 2026 codebase has its own sins, I'm sure, but at least the potato isn't 22 megabytes.

What I'd tell 2018-me

Don't draw 80 sprite sheets by hand. Build a rig and a calibration tool first. Treat assets as data, not commitments.

And — when image models are good enough that you can hand them a reference and ask for "the same thing, wearing a fez" — take them up on it. The 2018 Potato Quest had a handful of potato variants because that's how many I could draw. The 2026 one has dozens because the bottleneck moved from drawing to deciding.

The game itself? Same idea, eight years later. Spuds, swings, small dramas. It shipped in 2018 and it's shipping again now — only this time with dozens of skins, two head-to-head modes, and a rig that lets it keep growing.

Try it right here

Enough words about it: here's the actual rebuilt game, the Flutter build compiled for the web. Same potato, same rig, same parallax you scrubbed through above.

Drag to aim, release to launch — don’t fall off the back of the screen. Best on desktop or tablet.

—

Potato Quest is in soft launch right now. Grab it on the App Store or Google Play, or see more on the product page.